HPX documentation#

Welcome to the HPX documentation!

If you’re new to HPX you can get started with the Quick start guide. Don’t forget to read the Terminology section to learn about the most important concepts in HPX. The Examples give you a feel for how it is to write real HPX applications and the Manual contains detailed information about everything from building HPX to debugging it. There are links to blog posts and videos about HPX in Additional material.

You can find a comprehensive list of contact options on Support for deploying and using HPX. Do not hesitate to contact us if you can’t find what you are looking for in the documentation!

See Citing HPX for details on how to cite HPX in publications. See HPX users for a list of institutions and projects using HPX.

A PDF version of this documentation and a Single HTML Page version are also available.

What is HPX?#

HPX is a C++ Standard Library for Concurrency and Parallelism. It implements all of the corresponding facilities as defined by the C++ Standard. Additionally, in HPX we implement functionalities proposed as part of the ongoing C++ standardization process. We also extend the C++ Standard APIs to the distributed case. HPX is developed by the STE||AR group (see People).

The goal of HPX is to create a high-quality, freely available, open source implementation of a new programming model for conventional systems, such as classic Linux-based Beowulf clusters or multi-socket highly parallel SMP nodes. At the same time, we want a modular and well-designed runtime system architecture that allows us to port our implementation to new computer system architectures. We want to use real-world applications to drive the development of the runtime system, identifying the required functionality and converging on a stable API that provides a smooth migration path for developers.

The API exposed by HPX is not only modeled after the interfaces defined by the C++11/14/17/20 ISO standard. It also adheres to the programming guidelines used by the Boost collection of C++ libraries. We aim to improve the scalability of today’s applications and to expose new levels of parallelism which are necessary to take advantage of the exascale systems of the future.

What’s so special about HPX?#

  • HPX exposes a uniform, standards-oriented API for ease of programming parallel and distributed applications.

  • It enables programmers to write fully asynchronous code using hundreds of millions of threads.

  • HPX provides unified syntax and semantics for local and remote operations.

  • HPX makes concurrency manageable with dataflow and future based synchronization.

  • It implements a rich set of runtime services supporting a broad range of use cases.

  • HPX exposes a uniform, flexible, and extendable performance counter framework which can enable runtime adaptivity.

  • It is designed to solve problems conventionally considered to be scaling-impaired.

  • HPX has been designed and developed for systems of any scale, from hand-held devices to very large scale systems.

  • It is the first fully functional implementation of the ParalleX execution model.

  • HPX is published under a liberal open-source license and has an open, active, and thriving developer community.

Quick start#

The following steps will help you get started with HPX. Before getting started, make sure you have all the necessary prerequisites, which are listed in Prerequisites. After Installing HPX, see Hello, World! for how to run a simple example. Writing task-based applications explains how you can get started with HPX. You can refer to our Migration guide if you use other APIs for parallelism (like OpenMP, MPI or Intel Threading Building Blocks (TBB)) and you would like to convert your code to HPX code.

Installing HPX#

The easiest way to install HPX on your system is by choosing one of the steps below:

  1. vcpkg

    You can download and install HPX using the vcpkg dependency manager:

    $ vcpkg install hpx
    
  2. Spack

    Another way to install HPX is using Spack:

    $ spack install hpx
    
  3. Fedora

    Installation can be done with Fedora as well:

    $ dnf install hpx*
    
  4. Arch Linux

    HPX is available in the Arch User Repository (AUR) as hpx too.

More information and alternative installation methods can be found in Building HPX, a detailed guide that explains the different ways to build and use HPX.

Hello, World!#

To get started with this minimal example you need to create a new project directory and a file CMakeLists.txt with the contents below in order to build an executable using CMake and HPX:

cmake_minimum_required(VERSION 3.19)
project(my_hpx_project CXX)
find_package(HPX REQUIRED)
add_executable(my_hpx_program main.cpp)
target_link_libraries(my_hpx_program HPX::hpx HPX::wrap_main HPX::iostreams_component)

The next step is to create a main.cpp with the contents below:

// Including 'hpx/hpx_main.hpp' instead of the usual 'hpx/hpx_init.hpp' makes it
// possible to use the plain C-main below as the direct main HPX entry point.
#include <hpx/hpx_main.hpp>
#include <hpx/iostream.hpp>

int main()
{
    // Say hello to the world!
    hpx::cout << "Hello World!\n" << std::flush;
    return 0;
}

Then, in your project directory run the following:

$ mkdir build && cd build
$ cmake -DCMAKE_PREFIX_PATH=</path/to/hpx/installation> ..
$ make all
$ ./my_hpx_program
Hello World!

The program looks almost like a regular C++ hello world with the exception of the two includes and hpx::cout.

  • When you include hpx_main.hpp HPX makes sure that main actually gets launched on the HPX runtime. So while it looks almost the same you can now use futures, async, parallel algorithms and more which make use of the HPX runtime with lightweight threads.

  • hpx::cout is a replacement for std::cout to make sure printing never blocks a lightweight thread. You can read more about hpx::cout in The HPX I/O-streams component.

Caution

Ensure that HPX is installed with HPX_WITH_DISTRIBUTED_RUNTIME=ON to prevent encountering an error indicating that the HPX::iostreams_component target is not found.

Note

When including hpx_main.hpp the user-defined main gets renamed and the real main function is defined by HPX. This means that the user-defined main must include a return statement, unlike the real main. If you do not include the return statement, you may end up with confusing compile time errors mentioning user_main or even runtime errors.

Writing task-based applications#

So far we haven’t done anything that can’t be done using the C++ standard library. In this section we will give a short overview of what you can do with HPX on a single node. The essence is to avoid global synchronization and break up your application into small, composable tasks whose dependencies control the flow of your application. Remember, however, that HPX allows you to write distributed applications similarly to how you would write applications for a single node (see Why HPX? and Writing distributed applications).

If you are already familiar with async and future from the C++ standard library, the same functionality is available in HPX.

The following terminology is essential when talking about task-based C++ programs:

  • lightweight thread: Essential for good performance with task-based programs. Lightweight refers to smaller stacks and faster context switching compared to OS threads. Smaller overheads allow the program to be broken up into smaller tasks, which in turn helps the runtime fully utilize all processing units.

  • async: The most basic way of launching tasks asynchronously. Returns a future<T>.

  • future<T>: Represents a value of type T that will be ready in the future. The value can be retrieved with get (blocking) and one can check if the value is ready with is_ready (non-blocking).

  • shared_future<T>: Same as future<T> but can be copied (similar to std::unique_ptr vs std::shared_ptr).

  • continuation: A function that is to be run after a previous task has run (represented by a future). then is a method of future<T> that takes a function to run next. Used to build up dataflow DAGs (directed acyclic graphs). shared_futures help you split up nodes in the DAG and functions like when_all help you join nodes in the DAG.
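As a rough illustration of the terms above, the following sketch uses only the C++ standard library (std::async, std::future, std::shared_future), whose interface the HPX counterparts mirror. It is not HPX code: the helper names square_async and sum_of_two_consumers are hypothetical, and because std::future offers neither is_ready nor then, the sketch only uses get. Note also that std::async runs tasks on OS threads rather than on lightweight threads.

```cpp
#include <future>

// Launch a task asynchronously and hand back a future for its result.
// std::launch::async forces the task to run on a separate thread.
std::future<int> square_async(int x)
{
    return std::async(std::launch::async, [x]() { return x * x; });
}

// A shared_future can be copied, so several consumers can read the same
// result. Here two tasks each read the shared value and we add their results.
int sum_of_two_consumers(int x)
{
    std::shared_future<int> shared = square_async(x).share();

    // Both lambdas capture their own copy of the shared_future.
    std::future<int> plus_one =
        std::async(std::launch::async, [shared]() { return shared.get() + 1; });
    std::future<int> times_two =
        std::async(std::launch::async, [shared]() { return shared.get() * 2; });

    // get() blocks until each consumer has finished.
    return plus_one.get() + times_two.get();
}
```

Calling sum_of_two_consumers(3) splits one node of the task graph (the squared value) into two consumers and joins them again by adding their results.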

The following example is a collection of the most commonly used functionality in HPX:

#include <hpx/algorithm.hpp>
#include <hpx/future.hpp>
#include <hpx/init.hpp>

#include <cstdlib>
#include <iostream>
#include <random>
#include <vector>

void final_task(hpx::future<hpx::tuple<hpx::future<double>, hpx::future<void>>>)
{
    std::cout << "in final_task" << std::endl;
}

int hpx_main()
{
    // A function can be launched asynchronously. The program will not block
    // here until the result is available.
    hpx::future<int> f = hpx::async([]() { return 42; });
    std::cout << "Just launched a task!" << std::endl;

    // Use get to retrieve the value from the future. This will block this task
    // until the future is ready, but the HPX runtime will schedule other tasks
    // if there are tasks available.
    std::cout << "f contains " << f.get() << std::endl;

    // Let's launch another task.
    hpx::future<double> g = hpx::async([]() { return 3.14; });

    // Tasks can be chained using the then method. The continuation takes the
    // future as an argument.
    hpx::future<double> result = g.then([](hpx::future<double>&& gg) {
        // This function will be called once g is ready. gg is g moved
        // into the continuation.
        return gg.get() * 42.0 * 42.0;
    });

    // You can check if a future is ready with the is_ready method.
    std::cout << "Result is ready? " << result.is_ready() << std::endl;

    // You can launch other work in the meantime. Let's sort a vector.
    std::vector<int> v(1000000);

    // We fill the vector synchronously and sequentially.
    hpx::generate(hpx::execution::seq, std::begin(v), std::end(v), &std::rand);

    // We can launch the sort in parallel and asynchronously.
    hpx::future<void> done_sorting =
        hpx::sort(hpx::execution::par(          // In parallel.
                      hpx::execution::task),    // Asynchronously.
            std::begin(v), std::end(v));

    // We launch the final task when the vector has been sorted and result is
    // ready using when_all.
    auto all = hpx::when_all(result, done_sorting).then(&final_task);

    // We can wait for all to be ready.
    all.wait();

    // all must be ready at this point because we waited for it to be ready.
    std::cout << (all.is_ready() ? "all is ready!" : "all is not ready...")
              << std::endl;

    return hpx::local::finalize();
}

int main(int argc, char* argv[])
{
    return hpx::local::init(hpx_main, argc, argv);
}

Try copying the contents to your main.cpp file and look at the output. It can be a good idea to go through the program step by step with a debugger. You can also try changing the types or adding new arguments to functions to make sure you can get the types to match. The type of the then method can be especially tricky to get right (the continuation needs to take the future as an argument).

Note

HPX programs accept command line arguments. The most important one is --hpx:threads=N to set the number of OS threads used by HPX. HPX uses one thread per core by default. Play around with the example above and see what difference the number of threads makes on the sort function. See Launching and configuring HPX applications for more details on how and what options you can pass to HPX.

Tip

The example above used the construction hpx::when_all(...).then(...). For convenience and performance it is a good idea to replace uses of hpx::when_all(...).then(...) with dataflow. See Dataflow for more details on dataflow.

Tip

If possible, try to use the provided parallel algorithms instead of writing your own implementation. This can save you time and the resulting program is often faster.

Next steps#

If you haven’t done so already, reading the Terminology section will help you get familiar with the terms used in HPX.

The Examples section contains small, self-contained walkthroughs of example HPX programs. The Local to remote example is a thorough, realistic example starting from a single node implementation and going stepwise to a distributed implementation.

The Manual contains detailed information on writing, building and running HPX applications.

Examples#

The following sections analyze some examples to help you get familiar with the HPX style of programming. We start off with simple examples that utilize basic HPX elements and then begin to expose the reader to the more complex and powerful HPX concepts. Section Building tests and examples shows how you can build the examples.

Asynchronous execution#

The Fibonacci sequence is a sequence of numbers starting with 0 and 1 where every subsequent number is the sum of the previous two numbers. In this example, we will use HPX to calculate the value of the n-th element of the Fibonacci sequence. In order to compute this problem in parallel, we will use a facility known as a future.

As shown in Fig. 1 below, a future encapsulates a delayed computation. It acts as a proxy for a result initially not known, most of the time because the computation of the result has not completed yet. The future synchronizes the access of this value by optionally suspending any HPX-threads requesting the result until the value is available. When a future is created, it spawns a new HPX-thread (either remotely with a parcel or locally by placing it into the thread queue) which, when run, will execute the function associated with the future. The arguments of the function are bound when the future is created.

Fig. 1 Schematic of a future execution.#

Once the function has finished executing, a write operation is performed on the future. The write operation marks the future as completed, and optionally stores data returned by the function. When the result of the delayed computation is needed, a read operation is performed on the future. If the future’s function hasn’t completed when a read operation is performed on it, the reader HPX-thread is suspended until the future is ready. The future facility allows HPX to schedule work early in a program so that when the function value is needed it will already be calculated and available. We use this property in our Fibonacci example below to enable its parallel execution.
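The write and read operations described above can be sketched with the standard library's std::promise and std::future. This is an analogy under stated assumptions, not HPX's implementation: the helper names is_ready and write_then_read are hypothetical, and a standard future blocks the calling OS thread, whereas HPX suspends only the reading HPX-thread.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Returns true if the future already holds a value (non-blocking check).
bool is_ready(std::future<int> const& f)
{
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}

// Demonstrates the write and read operations on a future.
// Returns the value read from the future.
int write_then_read(int value)
{
    std::promise<int> p;
    std::future<int> f = p.get_future();

    // The producer performs the write operation, which marks the future as
    // ready and stores the value.
    std::thread producer([&p, value]() { p.set_value(value); });

    // The read operation: get() waits until the write has happened.
    int result = f.get();
    producer.join();
    return result;
}
```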

Setup#

The source code for this example can be found here: fibonacci_local.cpp.

To compile this program, go to your HPX build directory (see Building HPX for information on configuring and building HPX) and enter:

$ make examples.quickstart.fibonacci_local

To run the program type:

$ ./bin/fibonacci_local

This should print (time should be approximate):

fibonacci(10) == 55
elapsed time: 0.002430 [s]

This run used the default settings, which calculate the tenth element of the Fibonacci sequence. To declare which Fibonacci value you want to calculate, use the --n-value option. Additionally you can use the --hpx:threads option to declare how many OS-threads you wish to use when running the program. For instance, running:

$ ./bin/fibonacci_local --n-value 20 --hpx:threads 4

Will yield:

fibonacci(20) == 6765
elapsed time: 0.062854 [s]

Walkthrough#

Now that you have compiled and run the code, let’s look at how the code works. Since this code is written in C++, we will begin with the main() function. Here you can see that in HPX, main() is only used to initialize the runtime system. It is important to note that application-specific command line options are defined here. HPX uses hpx::program_options, whose interface follows Boost.Program_options, for command line processing. You can see that our program’s --n-value option is set by calling the add_options() method on an instance of hpx::program_options::options_description. The default value of the variable is set to 10. This is why, when we ran the program for the first time without the --n-value option, it returned the 10th value of the Fibonacci sequence. The constructor argument of the description is the text that appears when a user uses the --hpx:help option to see what command line options are available. HPX_APPLICATION_STRING is a macro that expands to a string constant containing the name of the HPX application currently being compiled.

In HPX, main() is used to initialize the runtime system and pass the command line arguments to the program. If you wish to add command line options to your program, add them here using the options_description instance and its public member function add_options() (see Boost Documentation for more details). In this example, hpx::local::init calls hpx_main() after setting up HPX, which is where the logic of our program is encoded.

int main(int argc, char* argv[])
{
    // Configure application-specific options
    hpx::program_options::options_description desc_commandline(
        "Usage: " HPX_APPLICATION_STRING " [options]");

    // clang-format off
    desc_commandline.add_options()
        ("n-value",
            hpx::program_options::value<std::uint64_t>()->default_value(10),
            "n value for the Fibonacci function")
        ;
    // clang-format on

    // Initialize and run HPX
    hpx::local::init_params init_args;
    init_args.desc_cmdline = desc_commandline;

    return hpx::local::init(hpx_main, argc, argv, init_args);
}

The hpx::local::init function in main() starts the runtime system and invokes hpx_main() as the first HPX-thread. Below we can see that the basic program is simple. The command line option --n-value is read in, a timer (hpx::chrono::high_resolution_timer) is set up to record the time it takes to do the computation, the fibonacci function is invoked synchronously, and the answer is printed out.

int hpx_main(hpx::program_options::variables_map& vm)
{
    hpx::threads::add_scheduler_mode(
        hpx::threads::policies::scheduler_mode::fast_idle_mode);

    // extract command line argument, i.e. fib(N)
    std::uint64_t n = vm["n-value"].as<std::uint64_t>();

    {
        // Keep track of the time required to execute.
        hpx::chrono::high_resolution_timer t;

        std::uint64_t r = fibonacci(n);

        char const* fmt = "fibonacci({1}) == {2}\nelapsed time: {3} [s]\n";
        hpx::util::format_to(std::cout, fmt, n, r, t.elapsed());
    }

    return hpx::local::finalize();    // Handles HPX shutdown
}

The fibonacci function itself is called synchronously, while the work done inside it is asynchronous. To understand what is happening we have to look inside the fibonacci function:

std::uint64_t fibonacci(std::uint64_t n)
{
    if (n < 2)
        return n;

    hpx::future<std::uint64_t> n1 = hpx::async(fibonacci, n - 1);
    std::uint64_t n2 = fibonacci(n - 2);

    return n1.get() + n2;    // wait for the future to return its value
}

This block of code looks similar to regular C++ code. First, if (n < 2), meaning n is 0 or 1, we return n itself (recall that the first element of the Fibonacci sequence is 0 and the second is 1). If n is larger than 1, we spawn one new task for fibonacci(n - 1), whose result is held in the future n1, and compute fibonacci(n - 2) directly on the current HPX-thread, storing the result in n2. The task is launched with hpx::async, which takes as arguments a function (function pointer, object or lambda) and the arguments to the function. Instead of returning a std::uint64_t like fibonacci does, hpx::async returns a future of a std::uint64_t, i.e. hpx::future<std::uint64_t>. This future represents an asynchronous, recursive call to fibonacci. Once the direct call has returned, we retrieve the value of the future using the get method, which waits until it is ready, add the two values together, and return the sum as our result. The recursive call tree will continue until n is equal to 0 or 1, at which point the value can be returned because it is implicitly known. When this termination condition is reached, the partial results can be added up, producing the n-th value of the Fibonacci sequence.

Note that calling get potentially blocks the calling HPX-thread, and lets other HPX-threads run in the meantime. There are, however, more efficient ways of doing this. examples/quickstart/fibonacci_futures.cpp contains many more variations of locally computing the Fibonacci numbers, where each method makes different tradeoffs in where asynchrony and parallelism are applied. To get started, however, the method above is sufficient and optimizations can be applied once you are more familiar with HPX. The example Dataflow presents dataflow, which is a way to more efficiently chain together multiple tasks.
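One possible tradeoff of this kind is to stop creating tasks once the subproblem becomes small. The sketch below illustrates the idea with std::async from the C++ standard library; it is not taken from fibonacci_futures.cpp, and the names fibonacci_serial, fibonacci_cutoff and the cutoff parameter are assumptions made for illustration. The structure mirrors the function above: one asynchronous call for n - 1 and one direct call for n - 2.

```cpp
#include <cstdint>
#include <future>

// Plain recursive Fibonacci without any task creation.
std::uint64_t fibonacci_serial(std::uint64_t n)
{
    return n < 2 ? n : fibonacci_serial(n - 1) + fibonacci_serial(n - 2);
}

// Spawns a task only while n is at least `cutoff`; smaller subproblems are
// computed serially. The cutoff value is a tuning assumption.
std::uint64_t fibonacci_cutoff(std::uint64_t n, std::uint64_t cutoff)
{
    if (n < 2 || n < cutoff)
        return fibonacci_serial(n);

    // One asynchronous recursive call and one direct recursive call.
    std::future<std::uint64_t> n1 =
        std::async(std::launch::async, fibonacci_cutoff, n - 1, cutoff);
    std::uint64_t n2 = fibonacci_cutoff(n - 2, cutoff);

    return n1.get() + n2;    // wait for the asynchronous half
}
```

Because std::async with std::launch::async runs each task on its own OS thread, a cutoff matters far more here than it would with lightweight HPX-threads; for example, fibonacci_cutoff(20, 15) keeps the number of spawned tasks small.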

Parallel algorithms#

This program will perform a matrix multiplication in parallel. The output will look something like this:

Matrix A is :
4 9 6
1 9 8

Matrix B is :
4 9
6 1
9 8

Resultant Matrix is :
124 93
130 82

Setup#

The source code for this example can be found here: matrix_multiplication.cpp.

To compile this program, go to your HPX build directory (see Building HPX for information on configuring and building HPX) and enter:

$ make examples.quickstart.matrix_multiplication

To run the program type:

$ ./bin/matrix_multiplication

or:

$ ./bin/matrix_multiplication --n 2 --m 3 --k 2 --s 100 --l 0 --u 10

where the first matrix is n x m and the second is m x k, s is the seed used to create the random values of the matrices, and the range of these values is [l,u].

This should print:

Matrix A is :
4 9 6
1 9 8

Matrix B is :
4 9
6 1
9 8

Resultant Matrix is :
124 93
130 82

Notice that the numbers may be different because of the random initialization of the matrices.

Walkthrough#

Now that you have compiled and run the code, let’s look at how the code works.

First, main() is used to initialize the runtime system and pass the command line arguments to the program. hpx::local::init calls hpx_main() after setting up HPX, which is where our program is implemented.

int main(int argc, char* argv[])
{
    using namespace hpx::program_options;
    options_description cmdline("usage: " HPX_APPLICATION_STRING " [options]");
    // clang-format off
    cmdline.add_options()
        ("n",
        hpx::program_options::value<std::size_t>()->default_value(2),
        "Number of rows of first matrix")
        ("m",
        hpx::program_options::value<std::size_t>()->default_value(3),
        "Number of columns of first matrix (equal to the number of rows of "
        "second matrix)")
        ("k",
        hpx::program_options::value<std::size_t>()->default_value(2),
        "Number of columns of second matrix")
        ("seed,s",
        hpx::program_options::value<unsigned int>(),
        "The random number generator seed to use for this run")
        ("l",
        hpx::program_options::value<int>()->default_value(0),
        "Lower limit of range of values")
        ("u",
        hpx::program_options::value<int>()->default_value(10),
        "Upper limit of range of values");
    // clang-format on
    hpx::local::init_params init_args;
    init_args.desc_cmdline = cmdline;

    return hpx::local::init(hpx_main, argc, argv, init_args);
}

Proceeding to the hpx_main() function, we can see that matrix multiplication can be done very easily.

int hpx_main(hpx::program_options::variables_map& vm)
{
    using element_type = int;

    // Define matrix sizes
    std::size_t const rowsA = vm["n"].as<std::size_t>();
    std::size_t const colsA = vm["m"].as<std::size_t>();
    std::size_t const rowsB = colsA;
    std::size_t const colsB = vm["k"].as<std::size_t>();
    std::size_t const rowsR = rowsA;
    std::size_t const colsR = colsB;

    // Initialize matrices A and B
    std::vector<int> A(rowsA * colsA);
    std::vector<int> B(rowsB * colsB);
    std::vector<int> R(rowsR * colsR);

    // Define seed
    unsigned int seed = std::random_device{}();
    if (vm.count("seed"))
        seed = vm["seed"].as<unsigned int>();

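    // gen is the random number engine declared elsewhere in the example
    // source (not shown in this excerpt).
    gen.seed(seed);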
    std::cout << "using seed: " << seed << std::endl;

    // Define range of values
    int const lower = vm["l"].as<int>();
    int const upper = vm["u"].as<int>();

    // Matrices have random values in the range [lower, upper]
    std::uniform_int_distribution<element_type> dis(lower, upper);
    auto generator = std::bind(dis, gen);
    hpx::ranges::generate(A, generator);
    hpx::ranges::generate(B, generator);

    // Perform matrix multiplication
    hpx::experimental::for_loop(hpx::execution::par, 0, rowsA, [&](auto i) {
        hpx::experimental::for_loop(0, colsB, [&](auto j) {
            R[i * colsR + j] = 0;
            hpx::experimental::for_loop(0, rowsB, [&](auto k) {
                R[i * colsR + j] += A[i * colsA + k] * B[k * colsB + j];
            });
        });
    });

    // Print all 3 matrices
    print_matrix(A, rowsA, colsA, "A");
    print_matrix(B, rowsB, colsB, "B");
    print_matrix(R, rowsR, colsR, "R");

    return hpx::local::finalize();
}

First, the dimensions of the matrices are defined. If they were not given as command-line arguments, their default values are 2 x 3 for the first matrix and 3 x 2 for the second. We use standard vectors to define the matrices to be multiplied as well as the resultant matrix.

To give some random initial values to our matrices, we use std::uniform_int_distribution. Then, std::bind() is used along with hpx::ranges::generate() to fill the two matrices A and B with values in the default range [0, 10] or in the range given by the user via the command-line arguments. The seed used to generate the values can also be set by the user.
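The seeded initialization can be sketched with the standard library alone. The helper name random_matrix and the choice of std::mt19937 as the engine are assumptions for illustration; the example itself uses hpx::ranges::generate with a bound generator.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Fill a rows x cols matrix (stored row-major in a flat vector) with
// uniformly distributed integers in the closed range [lower, upper].
std::vector<int> random_matrix(std::size_t rows, std::size_t cols,
    unsigned int seed, int lower, int upper)
{
    std::mt19937 gen(seed);    // engine choice is an assumption
    std::uniform_int_distribution<int> dis(lower, upper);

    std::vector<int> m(rows * cols);
    // The lambda uses gen by reference, so the engine advances per element.
    std::generate(m.begin(), m.end(), [&]() { return dis(gen); });
    return m;
}
```

Calling random_matrix twice with the same seed yields identical matrices, which makes runs reproducible. The sample output shown earlier, where A and B hold the same sequence 4 9 6 1 9 8, is consistent with both matrices being filled from copies of one generator state.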

The next step is to perform the matrix multiplication in parallel. This can be done by just using an hpx::experimental::for_loop combined with a parallel execution policy hpx::execution::par as the outer loop of the multiplication. Note that the execution of hpx::experimental::for_loop without specifying an execution policy is equivalent to specifying hpx::execution::seq as the execution policy.
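To show the loop structure without HPX, the following sketch parallelizes only the outer loop by launching one std::async task per row of the result. It is a stand-in under stated assumptions, not the HPX implementation: the function name multiply is hypothetical, and one OS thread per row is only reasonable for small matrices.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Multiply A (rowsA x colsA) by B (colsA x colsB); all matrices are stored
// row-major in flat vectors. One task per row of the result stands in for
// the parallel outer loop.
std::vector<int> multiply(std::vector<int> const& A, std::vector<int> const& B,
    std::size_t rowsA, std::size_t colsA, std::size_t colsB)
{
    std::vector<int> R(rowsA * colsB, 0);
    std::vector<std::future<void>> rows;
    rows.reserve(rowsA);

    for (std::size_t i = 0; i < rowsA; ++i)
    {
        // Each task writes only to row i of R, so no locking is needed.
        rows.push_back(std::async(std::launch::async, [&, i]() {
            for (std::size_t j = 0; j < colsB; ++j)
                for (std::size_t k = 0; k < colsA; ++k)
                    R[i * colsB + j] += A[i * colsA + k] * B[k * colsB + j];
        }));
    }

    // Wait for every row to finish before returning the result.
    for (auto& row : rows)
        row.get();
    return R;
}
```

With the matrices from the sample output (A = 4 9 6 / 1 9 8 and B = 4 9 / 6 1 / 9 8), multiply(A, B, 2, 3, 2) produces 124 93 / 130 82.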

Finally, the matrices A, B that are multiplied as well as the resultant matrix R are printed using the following function.

void print_matrix(std::vector<int> const& M, std::size_t rows, std::size_t cols,
    char const* message)
{
    std::cout << "\nMatrix " << message << " is:" << std::endl;
    for (std::size_t i = 0; i < rows; i++)
    {
        for (std::size_t j = 0; j < cols; j++)
            std::cout << M[i * cols + j] << " ";
        std::cout << "\n";
    }
}

Asynchronous execution with actions#

This example extends the Asynchronous execution example by introducing actions: functions that can be run remotely. In this example, however, we will still only run the action locally. The mechanism to execute actions stays the same: hpx::async. Later examples will demonstrate running actions on remote localities (e.g. Remote execution with actions).

Setup#

The source code for this example can be found here: fibonacci.cpp.

To compile this program, go to your HPX build directory (see Building HPX for information on configuring and building HPX) and enter:

$ make examples.quickstart.fibonacci

To run the program type:

$ ./bin/fibonacci

This should print (time should be approximate):

fibonacci(10) == 55
elapsed time: 0.00186288 [s]

This run used the default settings, which calculate the tenth element of the Fibonacci sequence. To declare which Fibonacci value you want to calculate, use the --n-value option. Additionally you can use the --hpx:threads option to declare how many OS-threads you wish to use when running the program. For instance, running:

$ ./bin/fibonacci --n-value 20 --hpx:threads 4

Will yield:

fibonacci(20) == 6765
elapsed time: 0.233827 [s]

Walkthrough#

The code needed to initialize the HPX runtime closely follows the Asynchronous execution example, except that it calls hpx::init instead of hpx::local::init:

int main(int argc, char* argv[])
{
    // Configure application-specific options
    hpx::program_options::options_description desc_commandline(
        "Usage: " HPX_APPLICATION_STRING " [options]");

    desc_commandline.add_options()("n-value",
        hpx::program_options::value<std::uint64_t>()->default_value(10),
        "n value for the Fibonacci function");

    // Initialize and run HPX
    hpx::init_params init_args;
    init_args.desc_cmdline = desc_commandline;

    return hpx::init(argc, argv, init_args);
}

The hpx::init function in main() starts the runtime system, and invokes hpx_main() as the first HPX-thread. The command line option --n-value is read in, a timer (hpx::chrono::high_resolution_timer) is set up to record the time it takes to do the computation, the fibonacci action is invoked synchronously, and the answer is printed out.

int hpx_main(hpx::program_options::variables_map& vm)
{
    // extract command line argument, i.e. fib(N)
    std::uint64_t n = vm["n-value"].as<std::uint64_t>();

    {
        // Keep track of the time required to execute.
        hpx::chrono::high_resolution_timer t;

        // Wait for fib() to return the value
        fibonacci_action fib;
        std::uint64_t r = fib(hpx::find_here(), n);

        char const* fmt = "fibonacci({1}) == {2}\nelapsed time: {3} [s]\n";
        hpx::util::format_to(std::cout, fmt, n, r, t.elapsed());
    }

    return hpx::finalize();    // Handles HPX shutdown
}

Upon a closer look we see that we’ve created a std::uint64_t to store the result of invoking our fibonacci_action fib. This action is invoked synchronously (the work done inside of the action is itself asynchronous) and returns the requested Fibonacci number. But wait, what is an action? And what is this fibonacci_action? For starters, an action is a wrapper for a function. By wrapping functions, HPX can send packets of work to different processing units. These vehicles allow users to calculate work now, later, or on certain nodes. The first argument to our action is the location where the action should be run. In this case, we just want to run the action on the machine that we are currently on, so we use hpx::find_here. To further understand this we turn to the code to find where fibonacci_action was defined:

// forward declaration of the Fibonacci function
std::uint64_t fibonacci(std::uint64_t n);

// This is to generate the required boilerplate we need for the remote
// invocation to work.
HPX_PLAIN_ACTION(fibonacci, fibonacci_action)

A plain action is the most basic form of action. Plain actions wrap simple global functions which are not associated with any particular object (we will discuss other types of actions in Components and actions). In this block of code the function fibonacci() is declared. After the declaration, the function is wrapped in an action using the macro HPX_PLAIN_ACTION. This macro takes two arguments: the name of the function that is to be wrapped and the name of the action that you are creating.

This picture should now start making sense. The function fibonacci() is wrapped in an action fibonacci_action, which was run synchronously but created asynchronous work, then returns a std::uint64_t representing the result of the function fibonacci(). Now, let’s look at the function fibonacci():

std::uint64_t fibonacci(std::uint64_t n)
{
    if (n < 2)
        return n;

    // We restrict ourselves to execute the Fibonacci function locally.
    hpx::id_type const locality_id = hpx::find_here();

    // Invoking the Fibonacci algorithm twice is inefficient.
    // However, we intentionally demonstrate it this way to create some
    // heavy workload.

    fibonacci_action fib;
    hpx::future<std::uint64_t> n1 = hpx::async(fib, locality_id, n - 1);
    hpx::future<std::uint64_t> n2 = hpx::async(fib, locality_id, n - 2);

    return n1.get() +
        n2.get();    // wait for the Futures to return their values
}

This block of code is much more straightforward and should look familiar from the previous example. First, if (n < 2), meaning n is 0 or 1, then we return 0 or 1 (recall the first element of the Fibonacci sequence is 0 and the second is 1). If n is larger than 1 we spawn two tasks using hpx::async. Each of these futures represents an asynchronous, recursive call to fibonacci. As previously we wait for both futures to finish computing, get the results, add them together, and return that value as our result. The recursive call tree will continue until n is equal to 0 or 1, at which point the value can be returned because it is implicitly known. When this termination condition is reached, the futures can then be added up, producing the n-th value of the Fibonacci sequence.

Remote execution with actions#

This program will print out a hello world message on every OS-thread on every locality. The output will look something like this:

hello world from OS-thread 1 on locality 0
hello world from OS-thread 1 on locality 1
hello world from OS-thread 0 on locality 0
hello world from OS-thread 0 on locality 1
Setup#

The source code for this example can be found here: hello_world_distributed.cpp.

To compile this program, go to your HPX build directory (see Building HPX for information on configuring and building HPX) and enter:

$ make examples.quickstart.hello_world_distributed

To run the program type:

$ ./bin/hello_world_distributed

This should print:

hello world from OS-thread 0 on locality 0

To use more OS-threads use the command line option --hpx:threads and type the number of threads that you wish to use. For example, typing:

$ ./bin/hello_world_distributed --hpx:threads 2

will yield:

hello world from OS-thread 1 on locality 0
hello world from OS-thread 0 on locality 0

Notice that the order of the two lines may change between runs. To run this program on multiple localities please see the section How to use HPX applications with PBS.

Walkthrough#

Now that you have compiled and run the code, let’s look at how the code works, beginning with main():

// Here is the main entry point. By using the include 'hpx/hpx_main.hpp' HPX
// will invoke the plain old C-main() as its first HPX thread.
int main()
{
    // Get a list of all available localities.
    std::vector<hpx::id_type> localities = hpx::find_all_localities();

    // Reserve storage space for futures, one for each locality.
    std::vector<hpx::future<void>> futures;
    futures.reserve(localities.size());

    for (hpx::id_type const& node : localities)
    {
        // Asynchronously start a new task. The task is encapsulated in a
        // future, which we can query to determine if the task has
        // completed.
        typedef hello_world_foreman_action action_type;
        futures.push_back(hpx::async<action_type>(node));
    }

    // The non-callback version of hpx::wait_all takes a single parameter,
    // a vector of futures to wait on. hpx::wait_all only returns when
    // all of the futures have finished.
    hpx::wait_all(futures);
    return 0;
}

In this excerpt of the code we again see the use of futures. This time the futures are stored in a vector so that they can easily be accessed. hpx::wait_all is a family of functions that wait for an std::vector<> of futures to become ready. In this piece of code, we are using the synchronous version of hpx::wait_all, which takes one argument (the std::vector<> of futures to wait on). This function will not return until all the futures in the vector have become ready.
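
As a rough analogue of this pattern, the following sketch uses only the C++ standard library. The helper names wait_all_std and greet_all are hypothetical, and the tasks merely increment a counter instead of running an action on a locality.

```cpp
#include <atomic>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical helper: blocks until every future in the vector is ready,
// loosely mirroring the single-argument hpx::wait_all call above.
void wait_all_std(std::vector<std::future<void>> const& futures)
{
    for (std::future<void> const& f : futures)
        f.wait();    // wait() blocks but does not consume the result
}

// Hypothetical driver: starts one task per "locality" and waits for all of
// them. Returns how many tasks have run by the time the wait completes.
std::size_t greet_all(std::size_t localities)
{
    std::atomic<std::size_t> greeted{0};
    std::vector<std::future<void>> futures;
    futures.reserve(localities);

    for (std::size_t i = 0; i < localities; ++i)
    {
        futures.push_back(
            std::async(std::launch::async, [&greeted] { ++greeted; }));
    }

    wait_all_std(futures);
    return greeted.load();
}
```

Because wait_all_std returns only after every future is ready, greet_all always reports that all tasks have run, regardless of the order in which they were scheduled.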

In Asynchronous execution with actions we used hpx::find_here to specify the target of our actions. Here, we instead use hpx::find_all_localities, which returns an std::vector<> containing the identifiers of all the machines in the system, including the one that we are on.

As in Asynchronous execution with actions our futures are set using hpx::async<>. The hello_world_foreman_action is declared here:

// Define the boilerplate code necessary for the function 'hello_world_foreman'
// to be invoked as an HPX action.
HPX_PLAIN_ACTION(hello_world_foreman, hello_world_foreman_action)

Another way of thinking about this wrapping technique is as follows: functions (the work to be done) are wrapped in actions, and actions can be executed locally or remotely (e.g. on another machine participating in the computation).

Now it is time to look at the hello_world_foreman() function which was wrapped in the action above:

void hello_world_foreman()
{
    // Get the number of worker OS-threads in use by this locality.
    std::size_t const os_threads = hpx::get_os_thread_count();

    // Populate a set with the OS-thread numbers of all OS-threads on this
    // locality. When the hello world message has been printed on a particular
    // OS-thread, we will remove it from the set.
    std::set<std::size_t> attendance;
    for (std::size_t os_thread = 0; os_thread < os_threads; ++os_thread)
        attendance.insert(os_thread);

    // As long as there are still elements in the set, we must keep scheduling
    // HPX-threads. Because HPX features work-stealing task schedulers, we have
    // no way of enforcing which worker OS-thread will actually execute
    // each HPX-thread.
    while (!attendance.empty())
    {
        // Each iteration, we create a task for each element in the set of
        // OS-threads that have not said "Hello world". Each of these tasks
        // is encapsulated in a future.
        std::vector<hpx::future<std::size_t>> futures;
        futures.reserve(attendance.size());

        for (std::size_t worker : attendance)
        {
            // Asynchronously start a new task. The task is encapsulated in a
            // future that we can query to determine if the task has completed.
            //
            // We give the task a hint to run on a particular worker thread
            // (core) and suggest binding the scheduled thread to the given
            // core, but no guarantees are given by the scheduler that the task
            // will actually run on that worker thread. It will however try as
            // hard as possible to place the new task on the given worker
            // thread.
            hpx::execution::parallel_executor exec(
                hpx::threads::thread_priority::bound);

            hpx::threads::thread_schedule_hint hint(
                hpx::threads::thread_schedule_hint_mode::thread,
                static_cast<std::int16_t>(worker));

            futures.push_back(
                hpx::async(hpx::execution::experimental::with_hint(exec, hint),
                    hello_world_worker, worker));
        }

        // Wait for all of the futures to finish. hpx::wait_each takes a
        // callback and a vector of futures, and invokes the callback once
        // for each future as it becomes ready. Because the callback is
        // wrapped in hpx::unwrapping, it receives the value held by the
        // future rather than the future itself. hpx::wait_each doesn't
        // return until all the futures in the vector have become ready.
        hpx::spinlock mtx;
        hpx::wait_each(hpx::unwrapping([&](std::size_t t) {
            if (std::size_t(-1) != t)
            {
                std::lock_guard<hpx::spinlock> lk(mtx);
                attendance.erase(t);
            }
        }),
            futures);
    }
}

Now, before we discuss hello_world_foreman(), let’s talk about the hpx::wait_each function. The version of hpx::wait_each used here invokes a callback function provided by the user for each future as it becomes ready, supplying the callback function with the result of the future.

In hello_world_foreman(), an std::set<> called attendance keeps track of which OS-threads have not yet printed the hello world message. When a task prints the message on the desired OS-thread, it returns that OS-thread’s number; once the future is ready, hpx::wait_each in hello_world_foreman() invokes the callback, which removes that number from attendance. If hello_world_worker() is not executing on the desired OS-thread, it returns std::size_t(-1), which causes hello_world_foreman() to leave the OS-thread id in attendance and to try again in the next iteration.

std::size_t hello_world_worker(std::size_t desired)
{
    // Returns the OS-thread number of the worker that is running this
    // HPX-thread.
    std::size_t current = hpx::get_worker_thread_num();
    if (current == desired)
    {
        // The HPX-thread has been run on the desired OS-thread.
        char const* msg = "hello world from OS-thread {1} on locality {2}\n";

        hpx::util::format_to(hpx::cout, msg, desired, hpx::get_locality_id())
            << std::flush;

        return desired;
    }

    // This HPX-thread has been run by the wrong OS-thread, make the foreman
    // try again by rescheduling it.
    return std::size_t(-1);
}

Because HPX features work stealing task schedulers, there is no way to guarantee that an action will be scheduled on a particular OS-thread. This is why we must use a guess-and-check approach.
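
The guess-and-check loop can be isolated from HPX in a small sketch. Everything here is hypothetical: rounds_until_all_greeted is an illustrative helper, and the try_on callback stands in for scheduling hello_world_worker, returning the thread number on success or std::size_t(-1) when the task ran on the wrong thread.

```cpp
#include <cstddef>
#include <functional>
#include <set>

// Hypothetical helper illustrating the retry loop. Returns the number of
// rounds needed until every thread number has been confirmed.
std::size_t rounds_until_all_greeted(std::size_t os_threads,
    std::function<std::size_t(std::size_t)> const& try_on)
{
    std::set<std::size_t> attendance;
    for (std::size_t t = 0; t < os_threads; ++t)
        attendance.insert(t);

    std::size_t rounds = 0;
    while (!attendance.empty())
    {
        ++rounds;

        // Iterate over a snapshot so that erasing from attendance is safe.
        std::set<std::size_t> const pending = attendance;
        for (std::size_t worker : pending)
        {
            std::size_t const result = try_on(worker);
            if (result != std::size_t(-1))
                attendance.erase(result);    // this thread has greeted
        }
    }
    return rounds;
}
```

If every attempt lands on the desired thread, the loop finishes after one round. Each miss leaves the thread number in the set, so it is retried in the next round.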

Components and actions#

The accumulator examples demonstrate the use of components. Components are C++ classes that expose methods as a type of HPX action. These actions are called component actions. There are three examples:

  • accumulator

  • template accumulator

  • template function accumulator

Components are globally named, meaning that a component action can be called remotely (e.g., from another machine).

In Asynchronous execution with actions and Remote execution with actions, we introduced plain actions, which wrap global functions. The target of a plain action is an identifier which refers to a particular machine involved in the computation. For plain actions, the target is the machine where the action will be executed.

Component actions, however, do not target machines. Instead, they target component instances. The instance may live on the machine that we’ve invoked the component action from, or it may live on another machine.

The components in these examples expose three different functions:

  • reset() - Resets the accumulator value to 0.

  • add(arg) - Adds arg to the accumulator’s value.

  • query() - Queries the value of the accumulator.

These examples create an instance of the (template or template function) accumulator, and then allow the user to enter commands at a prompt, which subsequently invoke actions on the accumulator instance.

Accumulator#
Setup#

The source code for this example can be found here: accumulator_client.cpp.

To compile this program, go to your HPX build directory (see Building HPX for information on configuring and building HPX) and enter:

$ make examples.accumulators.accumulator

To run the program type:

$ ./bin/accumulator_client

Once the program starts running, it will print the following prompt and then wait for input. An example session is given below:

commands: reset, add [amount], query, help, quit
> add 5
> add 10
> query
15
> add 2
> query
17
> reset
> add 1
> query
1
> quit
Walkthrough#

Now, let’s take a look at the source code of the accumulator example. This example consists of two parts: an HPX component library (a library that exposes an HPX component) and a client application which uses the library. This walkthrough will cover the HPX component library. The code for the client application can be found here: accumulator_client.cpp.

An HPX component is represented by two C++ classes:

  • A server class - The implementation of the component’s functionality.

  • A client class - A high-level interface that acts as a proxy for an instance of the component.

Typically, these two classes have the same name, but the server class usually lives in a different sub-namespace (server). For example, the full names of the two classes in accumulator are:

  • examples::server::accumulator (server class)

  • examples::accumulator (client class)

The server class#

The following code is from server/accumulator.hpp.

All HPX component server classes must inherit publicly from the HPX component base class, hpx::components::component_base.

The accumulator component inherits from hpx::components::locking_hook. This allows the runtime system to ensure that all action invocations are serialized. That means that the system ensures that no two actions are invoked at the same time on a given component instance. This makes the component thread safe and no additional locking has to be implemented by the user. Moreover, an accumulator component is a component because it also inherits from hpx::components::component_base (the template argument passed to locking_hook is used as its base class). The following snippet shows the corresponding code:

    class accumulator
      : public hpx::components::locking_hook<
            hpx::components::component_base<accumulator>>

Our accumulator class will need a data member to store its value in, so let’s declare a data member:

        argument_type value_;

The constructor for this class simply initializes value_ to 0:

        accumulator()
          : value_(0)
        {
        }

Next, let’s look at the three methods of this component that we will be exposing as component actions:

Here are the action types. These types wrap the methods we’re exposing. The wrapping technique is very similar to the one used in the Asynchronous execution with actions and the Remote execution with actions:

        HPX_DEFINE_COMPONENT_ACTION(accumulator, reset)
        HPX_DEFINE_COMPONENT_ACTION(accumulator, add)
        HPX_DEFINE_COMPONENT_ACTION(accumulator, query)

The last piece of code in the server class header is the declaration of the action type registration code:

HPX_REGISTER_ACTION_DECLARATION(
    examples::server::accumulator::reset_action, accumulator_reset_action)

HPX_REGISTER_ACTION_DECLARATION(
    examples::server::accumulator::add_action, accumulator_add_action)

HPX_REGISTER_ACTION_DECLARATION(
    examples::server::accumulator::query_action, accumulator_query_action)

Note

The code above must be placed in the global namespace.

The rest of the registration code is in accumulator.cpp:

///////////////////////////////////////////////////////////////////////////////
// Add factory registration functionality.
HPX_REGISTER_COMPONENT_MODULE()

///////////////////////////////////////////////////////////////////////////////
typedef hpx::components::component<examples::server::accumulator>
    accumulator_type;

HPX_REGISTER_COMPONENT(accumulator_type, accumulator)

///////////////////////////////////////////////////////////////////////////////
// Serialization support for accumulator actions.
HPX_REGISTER_ACTION(
    accumulator_type::wrapped_type::reset_action, accumulator_reset_action)
HPX_REGISTER_ACTION(
    accumulator_type::wrapped_type::add_action, accumulator_add_action)
HPX_REGISTER_ACTION(
    accumulator_type::wrapped_type::query_action, accumulator_query_action)

Note

The code above must be placed in the global namespace.

The client class#

The following code is from accumulator.hpp.

The client class is the primary interface to a component instance. Client classes are used to create components:

// Create a component on this locality.
examples::accumulator c = hpx::new_<examples::accumulator>(hpx::find_here());

and to invoke component actions:

c.add(hpx::launch::apply, 4);

Clients, like servers, need to inherit from a base class, this time, hpx::components::client_base:

    class accumulator
      : public hpx::components::client_base<accumulator, server::accumulator>

For readability, we typedef the base class like so:

        typedef hpx::components::client_base<accumulator, server::accumulator>
            base_type;

The client class exposes the component actions as member functions. There are a few different ways of invoking actions:

  • Non-blocking: For actions that don’t have return types, or when we do not care about the result of an action, we can invoke the action using fire-and-forget semantics. This means that once we have asked HPX to compute the action, we forget about it completely and continue with our computation. We use hpx::post to invoke an action in a non-blocking fashion.

        void reset(hpx::launch::apply_policy)
        {
            HPX_ASSERT(this->get_id());

            typedef server::accumulator::reset_action action_type;
            hpx::post(action_type(), this->get_id());
        }
  • Asynchronous: Futures, as demonstrated in Asynchronous execution with actions and Remote execution with actions, enable asynchronous action invocation. The following member function from the accumulator client class uses hpx::async and returns a future to the result:

        hpx::future<argument_type> query(hpx::launch::async_policy)
        {
            HPX_ASSERT(this->get_id());

            typedef server::accumulator::query_action action_type;
            return hpx::async(action_type(), this->get_id());
        }
  • Synchronous: To invoke an action in a fully synchronous manner, we can simply call hpx::sync, which is semantically equivalent to hpx::async().get() (i.e., create a future and immediately wait on it to be ready). The following example from the accumulator client class invokes the action object directly, which likewise blocks until the action has completed:

        void add(argument_type arg)
        {
            HPX_ASSERT(this->get_id());

            typedef server::accumulator::add_action action_type;
            action_type()(this->get_id(), arg);
        }

Note that this->get_id() calls a member function inherited from the hpx::components::client_base base class; it returns the identifier of the server accumulator instance.

hpx::id_type is a type which represents a global identifier in HPX. This type specifies the target of an action. It is also the type returned by hpx::find_here, in which case it represents the locality the code is running on.

Template accumulator#
Walkthrough#
The server class#

The following code is from server/template_accumulator.hpp.

Similarly to the accumulator example, the component server class inherits publicly from hpx::components::component_base and from hpx::components::locking_hook ensuring thread-safe method invocations.

    template <typename T>
    class template_accumulator
      : public hpx::components::locking_hook<
            hpx::components::component_base<template_accumulator<T>>>

The body of the template accumulator class remains mainly the same as the accumulator with the difference that it uses templates in the data types.

        typedef T argument_type;

        template_accumulator()
          : value_(0)
        {
        }

        ///////////////////////////////////////////////////////////////////////
        // Exposed functionality of this component.

        /// Reset the component's value to 0.
        void reset()
        {
            //  set value_ to 0.
            value_ = 0;
        }

        /// Add the given number to the accumulator.
        void add(argument_type arg)
        {
            //  add value_ to arg, and store the result in value_.
            value_ += arg;
        }

        /// Return the current value to the caller.
        argument_type query() const
        {
            // Get the value of value_.
            return value_;
        }

        ///////////////////////////////////////////////////////////////////////
        // Each of the exposed functions needs to be encapsulated into an
        // action type, generating all required boilerplate code for threads,
        // serialization, etc.
        HPX_DEFINE_COMPONENT_ACTION(template_accumulator, reset)
        HPX_DEFINE_COMPONENT_ACTION(template_accumulator, add)
        HPX_DEFINE_COMPONENT_ACTION(template_accumulator, query)

The last piece of code in the server class header is the declaration of the action type registration code. REGISTER_TEMPLATE_ACCUMULATOR_DECLARATION(type) declares actions for the specified type, while REGISTER_TEMPLATE_ACCUMULATOR(type) registers the actions and the component for the specified type, using macros to handle boilerplate code.

#define REGISTER_TEMPLATE_ACCUMULATOR_DECLARATION(type)                        \
    HPX_REGISTER_ACTION_DECLARATION(                                           \
        examples::server::template_accumulator<type>::reset_action,            \
        HPX_PP_CAT(__template_accumulator_reset_action_, type))                \
                                                                               \
    HPX_REGISTER_ACTION_DECLARATION(                                           \
        examples::server::template_accumulator<type>::add_action,              \
        HPX_PP_CAT(__template_accumulator_add_action_, type))                  \
                                                                               \
    HPX_REGISTER_ACTION_DECLARATION(                                           \
        examples::server::template_accumulator<type>::query_action,            \
        HPX_PP_CAT(__template_accumulator_query_action_, type))                \
    /**/

#define REGISTER_TEMPLATE_ACCUMULATOR(type)                                    \
    HPX_REGISTER_ACTION(                                                       \
        examples::server::template_accumulator<type>::reset_action,            \
        HPX_PP_CAT(__template_accumulator_reset_action_, type))                \
                                                                               \
    HPX_REGISTER_ACTION(                                                       \
        examples::server::template_accumulator<type>::add_action,              \
        HPX_PP_CAT(__template_accumulator_add_action_, type))                  \
                                                                               \
    HPX_REGISTER_ACTION(                                                       \
        examples::server::template_accumulator<type>::query_action,            \
        HPX_PP_CAT(__template_accumulator_query_action_, type))                \
                                                                               \
    typedef ::hpx::components::component<                                      \
        examples::server::template_accumulator<type>>                          \
        HPX_PP_CAT(__template_accumulator_, type);                             \
    HPX_REGISTER_COMPONENT(HPX_PP_CAT(__template_accumulator_, type))          \
    /**/

Note

The code above must be placed in the global namespace.

Finally, HPX_REGISTER_COMPONENT_MODULE() in file server/template_accumulator.cpp adds the factory registration functionality.
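
The registration macros above use HPX_PP_CAT to build a distinct name for each type. The sketch below imitates that token-concatenation idiom with plain preprocessor macros. All EXAMPLE_* names are hypothetical and are not part of HPX; the generated functions only report the identifier that the concatenation produced.

```cpp
#include <string>

// All EXAMPLE_* names are hypothetical; they only imitate the two-level
// concatenation idiom and are not HPX macros.
#define EXAMPLE_CAT_IMPL(a, b) a##b
#define EXAMPLE_CAT(a, b) EXAMPLE_CAT_IMPL(a, b)
#define EXAMPLE_STRINGIZE_IMPL(x) #x
#define EXAMPLE_STRINGIZE(x) EXAMPLE_STRINGIZE_IMPL(x)

// Generates one uniquely named function per type. Each function returns
// the identifier that the concatenation produced for that type.
#define EXAMPLE_REGISTER(type)                                                 \
    std::string EXAMPLE_CAT(example_action_name_, type)()                      \
    {                                                                          \
        return EXAMPLE_STRINGIZE(EXAMPLE_CAT(example_action_, type));          \
    }

EXAMPLE_REGISTER(int)       // defines example_action_name_int()
EXAMPLE_REGISTER(double)    // defines example_action_name_double()
```

Because the type is pasted into an identifier, it has to be a single token in this sketch: a name such as unsigned int or std::string would need a typedef first. The same restriction presumably applies to the type passed to REGISTER_TEMPLATE_ACCUMULATOR.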

The client class#

The client class of the template accumulator can be found in template_accumulator.hpp and is very similar to the client class of the accumulator with the only difference that it uses templates and hence can work with different types.

Template function accumulator#
Walkthrough#
The server class#

The following code is from server/template_function_accumulator.hpp.

The component server class inherits publicly from hpx::components::component_base.

    class template_function_accumulator
      : public hpx::components::component_base<template_function_accumulator>

typedef hpx::spinlock mutex_type defines a mutex_type as hpx::spinlock for thread safety, while the code that follows exposes the functionality of this component.

        ///////////////////////////////////////////////////////////////////////
        // Exposed functionality of this component.

        /// Reset the value to 0.
        void reset()
        {
            // Atomically set value_ to 0.
            std::lock_guard<mutex_type> l(mtx_);
            value_ = 0;
        }

        /// Add the given number to the accumulator.
        template <typename T>
        void add(T arg)
        {
            // Atomically add value_ to arg, and store the result in value_.
            std::lock_guard<mutex_type> l(mtx_);
            value_ += static_cast<double>(arg);
        }

        /// Return the current value to the caller.
        double query() const
        {
            // Get the value of value_.
            std::lock_guard<mutex_type> l(mtx_);
            return value_;
        }
  • reset(): Resets the accumulator value to 0 in a thread-safe manner using std::lock_guard.

  • add(): Adds a value to the accumulator, allowing any type T that can be cast to double.

  • query(): Returns the current value of the accumulator in a thread-safe manner.
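
The locking pattern used by these three functions can be tried outside of HPX. The following is a minimal stand-alone sketch, assuming std::mutex in place of hpx::spinlock; the class name local_function_accumulator is hypothetical and the class is not an HPX component.

```cpp
#include <mutex>

// Hypothetical stand-alone class; it mirrors the locking pattern only and
// is not an HPX component.
class local_function_accumulator
{
public:
    // Set the value to 0 while holding the lock.
    void reset()
    {
        std::lock_guard<std::mutex> l(mtx_);
        value_ = 0;
    }

    // Accept any type that can be converted to double.
    template <typename T>
    void add(T arg)
    {
        std::lock_guard<std::mutex> l(mtx_);
        value_ += static_cast<double>(arg);
    }

    // Read the value while holding the lock.
    double query() const
    {
        std::lock_guard<std::mutex> l(mtx_);
        return value_;
    }

private:
    mutable std::mutex mtx_;    // mutable so that query() const can lock it
    double value_ = 0;
};
```

Calling add(2) and then add(1.5f) on a fresh instance stores 3.5, since both arguments are converted to double before being added.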

To define the actions for reset() and query() we can use the macro HPX_DEFINE_COMPONENT_ACTION. However, actions with template arguments require special type definitions. Therefore, we define the action type for add() by deriving from hpx::actions::make_action.

        ///////////////////////////////////////////////////////////////////////
        // Each of the exposed functions needs to be encapsulated into an
        // action type, generating all required boilerplate code for threads,
        // serialization, etc.

        HPX_DEFINE_COMPONENT_ACTION(template_function_accumulator, reset)
        HPX_DEFINE_COMPONENT_ACTION(template_function_accumulator, query)

        // Actions with template arguments (see add<>() above) require special
        // type definitions. The simplest way to define such an action type is
        // by deriving from the HPX facility make_action.
        template <typename T>
        struct add_action
          : hpx::actions::make_action<void (template_function_accumulator::*)(
                                          T),
                &template_function_accumulator::template add<T>,
                add_action<T>>::type
        {
        };

The last piece of code in the server class header is the action registration:

HPX_REGISTER_ACTION_DECLARATION(
    examples::server::template_function_accumulator::reset_action,
    managed_accumulator_reset_action)

HPX_REGISTER_ACTION_DECLARATION(
    examples::server::template_function_accumulator::query_action,
    managed_accumulator_query_action)

Note

The code above must be placed in the global namespace.

The rest of the registration code is in accumulator.cpp:

///////////////////////////////////////////////////////////////////////////////
// Add factory registration functionality.
HPX_REGISTER_COMPONENT_MODULE()

///////////////////////////////////////////////////////////////////////////////
typedef hpx::components::component<
    examples::server::template_function_accumulator>
    accumulator_type;

HPX_REGISTER_COMPONENT(accumulator_type, template_function_accumulator)

///////////////////////////////////////////////////////////////////////////////
// Serialization support for managed_accumulator actions.
HPX_REGISTER_ACTION(accumulator_type::wrapped_type::reset_action,
    managed_accumulator_reset_action)
HPX_REGISTER_ACTION(accumulator_type::wrapped_type::query_action,
    managed_accumulator_query_action)

Note

The code above must be placed in the global namespace.

The client class#

The client class of the template function accumulator can be found in template_function_accumulator.hpp and is very similar to the client class of the accumulator, with the only difference that it uses templates and hence can work with different types.

Dataflow#

HPX provides its users with several different tools for expressing parallel concepts simply. One of these tools is a local control object (LCO) called dataflow. An LCO is a type of component that can spawn a new thread when triggered. LCOs are also distinguished from other components by a standard interface that allows users to understand and use them easily. A dataflow, being an LCO, is triggered when the values it depends on become available. For instance, if you have a calculation X that depends on the results of three other calculations, you could set up a dataflow that would begin the calculation X as soon as the other three calculations have returned their values. Dataflows can be set up to depend on other dataflows. It is this property that makes dataflow a powerful parallelization tool. If you understand the dependencies of your calculation, you can devise a simple algorithm that sets up a dependency tree to be executed.

In this example, we calculate compound interest. To calculate compound interest, one must calculate the interest made in each compound period, and then add that interest back to the principal before calculating the interest made in the next period. A practical person would, of course, use the formula for compound interest:

\[F = P(1 + i) ^ n\]

where \(F\) is the future value, \(P\) is the principal value, \(i\) is the interest rate, and \(n\) is the number of compound periods.

However, for the sake of this example, we have chosen to manually calculate the future value by iterating:

\[I = Pi\]

and

\[P = P + I\]
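
To see that the iteration and the closed-form formula agree, both can be written as plain functions. This is a small sketch independent of HPX; the function names are hypothetical and do not appear in interest_calculator.cpp.

```cpp
#include <cmath>

// Hypothetical helper: applies I = P i and P = P + I once per period.
double future_value_iterative(double principal, double rate, int periods)
{
    for (int k = 0; k < periods; ++k)
    {
        double const interest = principal * rate;    // I = P i
        principal += interest;                       // P = P + I
    }
    return principal;
}

// Hypothetical helper: evaluates F = P (1 + i)^n directly.
double future_value_closed_form(double principal, double rate, int periods)
{
    return principal * std::pow(1.0 + rate, periods);
}
```

For a principal of 100, a rate of 0.05 per period, and 6 periods, both functions yield approximately 134.0096, up to floating-point rounding.
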
Setup#

The source code for this example can be found here: interest_calculator.cpp.

To compile this program, go to your HPX build directory (see Building HPX for information on configuring and building HPX) and enter:

$ make examples.quickstart.interest_calculator

To run the program type:

$ ./bin/interest_calculator --principal 100 --rate 5 --cp 6 --time 36
Final amount: 134.01
Amount made: 34.0096
Walkthrough#

Let us begin with main. Here we can see that we again are using Boost.Program_options to set our command line variables (see Asynchronous execution with actions for more details). These options set the principal, rate, compound period, and time. It is important to note that the units of time for cp and time must be the same.

int main(int argc, char** argv)
{
    options_description cmdline("Usage: " HPX_APPLICATION_STRING " [options]");

    cmdline.add_options()("principal", value<double>()->default_value(1000),
        "The principal [$]")("rate", value<double>()->default_value(7),
        "The interest rate [%]")("cp", value<int>()->default_value(12),
        "The compound period [months]")("time",
        value<int>()->default_value(12 * 30),
        "The time money is invested [months]");

    hpx::init_params init_args;
    init_args.desc_cmdline = cmdline;

    return hpx::init(argc, argv, init_args);
}

Next we look at hpx_main.

int hpx_main(variables_map& vm)
{
    {
        using hpx::dataflow;
        using hpx::make_ready_future;
        using hpx::shared_future;
        using hpx::unwrapping;
        hpx::id_type here = hpx::find_here();

        double init_principal =
            vm["principal"].as<double>();              //Initial principal
        double init_rate = vm["rate"].as<double>();    //Interest rate
        int cp = vm["cp"].as<int>();     //Length of a compound period
        int t = vm["time"].as<int>();    //Length of time money is invested

        init_rate /= 100;    //Rate is a % and must be converted
        t /= cp;    //Determine how many times to iterate interest calculation:
        //How many full compound periods can fit in the time invested

        // In non-dataflow terms the implemented algorithm would look like:
        //
        // int t = 5;    // number of time periods to use
        // double principal = init_principal;
        // double rate = init_rate;
        //
        // for (int i = 0; i < t; ++i)
        // {
        //     double interest = calc(principal, rate);
        //     principal = add(principal, interest);
        // }
        //
        // Please note the similarity with the code below!

        shared_future<double> principal = make_ready_future(init_principal);
        shared_future<double> rate = make_ready_future(init_rate);

        for (int i = 0; i < t; ++i)
        {
            shared_future<double> interest =
                dataflow(unwrapping(calc), principal, rate);
            principal = dataflow(unwrapping(add), principal, interest);
        }

        // wait for the dataflow execution graph to be finished calculating our
        // overall interest
        double result = principal.get();

        std::cout << "Final amount: " << result << std::endl;
        std::cout << "Amount made: " << result - init_principal << std::endl;
    }

    return hpx::finalize();
}

Here the command line variables are read in, the rate is converted from a percentage to a decimal fraction, the number of calculation iterations is determined, and then our shared futures are set up. Notice that we first place our principal and rate into shared futures by passing the variables init_principal and init_rate to hpx::make_ready_future.

In this way, the hpx::shared_future<double> objects principal and rate are initialized with init_principal and init_rate, because hpx::make_ready_future returns a future that already contains the given value. These shared futures then enter the for loop, where they are passed to the dataflow that computes interest. Next, principal and interest are passed to a second hpx::dataflow, whose result is assigned back to principal. A dataflow waits for its arguments to be ready before invoking its callback, so add in this case will not begin until both principal and interest are ready. This loop continues for each compound period that must be calculated. To see how interest and principal are calculated in the loop, let us look at the functions calc and add:

// Calculate interest for one period
double calc(double principal, double rate)
{
    return principal * rate;
}

///////////////////////////////////////////////////////////////////////////////
// Add the amount made to the principal
double add(double principal, double interest)
{
    return principal + interest;
}

After the shared future dependencies have been defined in hpx_main, we see the following statement:

double result = principal.get();

This statement calls hpx::shared_future::get on the shared future principal, whose value is calculated by our for loop. The program will wait here until the entire dataflow tree has been calculated and the value assigned to result. The program then prints out the final value of the investment and the amount of interest made, which it obtains by subtracting the initial value of the investment from the final value.

Local to remote#

When developers write code they typically begin with a simple serial code and build upon it until all of the required functionality is present. The following set of examples was developed to demonstrate this iterative process of evolving a simple serial program into an efficient, fully-distributed HPX application. For this demonstration, we implemented a 1D heat distribution problem. This calculation simulates the diffusion of heat across a ring from an initialized state to some user-defined point in the future. It does this by breaking the ring into discrete segments and using the current segment’s temperature and the temperature of the surrounding segments to calculate the temperature of the current segment in the next timestep, as shown in Fig. 2 below.

_images/1d_stencil_program_flow.png

Fig. 2 Heat diffusion example program flow.#

We parallelize this code over the following eight examples:

The first example is straight serial code. In this code we instantiate a vector U that contains two vectors of doubles as seen in the structure stepper.

struct stepper
{
    // Our partition type
    typedef double partition;

    // Our data for one time step
    typedef std::vector<partition> space;

    // Our operator
    static double heat(double left, double middle, double right)
    {
        return middle + (k * dt / (dx * dx)) * (left - 2 * middle + right);
    }

    // do all the work on 'nx' data points for 'nt' time steps
    space do_work(std::size_t nx, std::size_t nt)
    {
        // U[t][i] is the state of position i at time t.
        std::vector<space> U(2);
        for (space& s : U)
            s.resize(nx);

        // Initial conditions: f(0, i) = i
        for (std::size_t i = 0; i != nx; ++i)
            U[0][i] = double(i);

        // Actual time step loop
        for (std::size_t t = 0; t != nt; ++t)
        {
            space const& current = U[t % 2];
            space& next = U[(t + 1) % 2];

            next[0] = heat(current[nx - 1], current[0], current[1]);

            for (std::size_t i = 1; i != nx - 1; ++i)
                next[i] = heat(current[i - 1], current[i], current[i + 1]);

            next[nx - 1] = heat(current[nx - 2], current[nx - 1], current[0]);
        }

        // Return the solution at time-step 'nt'.
        return U[nt % 2];
    }
};

Each element in the vector of doubles represents a single grid point. To calculate the change in heat distribution, the temperature of each grid point, along with its neighbors, is passed to the function heat. In order to improve readability, references named current and next are created which, depending on the time step, point to the first and second vector of doubles. The first vector of doubles is initialized with a simple heat ramp. After calling the heat function with the data in the current vector, the results are placed into the next vector.

In example 2 we employ a technique called futurization. Futurization is a method by which we can easily transform a code that is serially executed into a code that creates asynchronous threads. In the simplest case this involves replacing a variable with a future to a variable, a function with a future to a function, and adding a .get() at the point where a value is actually needed. The code below shows how this technique was applied to the struct stepper.

struct stepper
{
    // Our partition type
    typedef hpx::shared_future<double> partition;

    // Our data for one time step
    typedef std::vector<partition> space;

    // Our operator
    static double heat(double left, double middle, double right)
    {
        return middle + (k * dt / (dx * dx)) * (left - 2 * middle + right);
    }

    // do all the work on 'nx' data points for 'nt' time steps
    hpx::future<space> do_work(std::size_t nx, std::size_t nt)
    {
        using hpx::dataflow;
        using hpx::unwrapping;

        // U[t][i] is the state of position i at time t.
        std::vector<space> U(2);
        for (space& s : U)
            s.resize(nx);

        // Initial conditions: f(0, i) = i
        for (std::size_t i = 0; i != nx; ++i)
            U[0][i] = hpx::make_ready_future(double(i));

        auto Op = unwrapping(&stepper::heat);

        // Actual time step loop
        for (std::size_t t = 0; t != nt; ++t)
        {
            space const& current = U[t % 2];
            space& next = U[(t + 1) % 2];

            // WHEN U[t][i-1], U[t][i], and U[t][i+1] have been computed, THEN we
            // can compute U[t+1][i]
            for (std::size_t i = 0; i != nx; ++i)
            {
                next[i] =
                    dataflow(hpx::launch::async, Op, current[idx(i, -1, nx)],
                        current[i], current[idx(i, +1, nx)]);
            }
        }

        // Now the asynchronous computation is running; the above for-loop does not
        // wait on anything. There is no implicit waiting at the end of each timestep;
        // the computation of each U[t][i] will begin as soon as its dependencies
        // are ready and hardware is available.

        // Return the solution at time-step 'nt'.
        return hpx::when_all(U[nt % 2]);
    }
};

In example 2, we redefine our partition type as a shared_future and, in main, create the object result, which is a future to a vector of partitions. We use result to represent the last vector in a string of vectors created for each timestep. In order to move to the next timestep, the values of a partition and its neighbors must be passed to heat once the futures that contain them are ready. In HPX, we have an LCO (Local Control Object) named Dataflow that assists the programmer in expressing this dependency. Dataflow allows us to pass the results of a set of futures to a specified function when the futures are ready. Dataflow takes three types of arguments, one which instructs the dataflow on how to perform the function call (async or sync), the function to call (in this case Op), and futures to the arguments that will be passed to the function. When called, dataflow immediately returns a future to the result of the specified function. This allows users to string dataflows together and construct an execution tree.

After the values of the futures in dataflow are ready, the values must be pulled out of the future container to be passed to the function heat. In order to do this, we use the HPX facility unwrapping, which underneath calls .get() on each of the futures so that the function heat will be passed doubles and not futures to doubles.

By setting up the algorithm this way, the program will be able to execute as quickly as the dependencies of each future are met. Unfortunately, this example runs terribly slowly. This increase in execution time is caused by the overhead of creating a future for each data point. Because the work done within each call to heat is very small, the overhead of creating and scheduling each of the three futures is greater than that of the actual useful work! In order to amortize the overheads of our synchronization techniques, we need to be able to control the amount of work that will be done with each future. We call this amount of work per unit of overhead the grain size.

In example 3, we return to our serial code to figure out how to control the grain size of our program. The strategy that we employ is to create “partitions” of data points. The user can define how many partitions are created and how many data points are contained in each partition. This is accomplished by creating the struct partition, which contains a member object data_, a vector of doubles that holds the data points assigned to a particular instance of partition.

In example 4, we take advantage of the partition setup by redefining space to be a vector of shared_futures with each future representing a partition. In this manner, each future represents several data points. Because the user can define how many data points are in each partition, and, therefore, how many data points are represented by one future, a user can control the grainsize of the simulation. The rest of the code is then futurized in the same manner as example 2. It should be noted how strikingly similar example 4 is to example 2.

Example 4 finally shows good results. This code scales equivalently to the OpenMP version. While these results are promising, there are more opportunities to improve the application’s scalability. Currently, this code only runs on one locality, but to get the full benefit of HPX, we need to be able to distribute the work to other machines in a cluster. We begin to add this functionality in example 5.

In order to run on a distributed system, a large amount of boilerplate code must be added. Fortunately, HPX provides us with the concept of a component, which saves us from having to write quite as much code. A component is an object that can be remotely accessed using its global address. Components are made of two parts: a server and a client class. While the client class is not required, abstracting the server behind a client allows us to ensure type safety instead of having to pass around pointers to global objects. Example 5 renames example 4’s struct partition to partition_data and adds serialization support. Next, we add the server side representation of the data in the structure partition_server. partition_server inherits from hpx::components::component_base, which contains the server-side component boilerplate. The boilerplate code allows a component’s public members to be accessible anywhere on the machine via its Global Identifier (GID). To encapsulate the component, we create a client side helper class, partition, which inherits its boilerplate code from hpx::components::client_base. This object allows us to create new instances of our component and access its members without having to know its GID. In addition, we use the client class to help manage our asynchrony. For example, the client class partition’s member function get_data() returns a future to the partition_data produced by the server’s get_data().

In the structure stepper, we have also had to make some changes to accommodate a distributed environment. In order to get the data from a particular neighboring partition, which could be remote, we must retrieve the data from all of the neighboring partitions. These retrievals are asynchronous, and the function heat_part_data, which, amongst other things, calls heat, should not be called until the data from the neighboring partitions has arrived. Therefore, it should come as no surprise that we synchronize this operation with another instance of dataflow (found in heat_part). This dataflow receives futures to the data in the current and surrounding partitions by calling get_data() on each respective partition. When these futures are ready, dataflow passes them to the unwrapping function, which extracts the shared_array of doubles and passes them to the lambda. The lambda calls heat_part_data on the locality where the middle partition resides.

Although this example could run distributed, it only runs on one locality, as it always uses hpx::find_here() as the target for the functions to run on.

In example 6, we begin to distribute the partition data on different nodes. This is accomplished in stepper::do_work() by passing the GID of the locality where we wish to create the partition to the partition constructor.

    for (std::size_t i = 0; i != np; ++i)
        U[0][i] = partition(localities[locidx(i, np, nl)], nx, double(i));

We distribute the partitions evenly based on the number of localities used, which is described in the function locidx. Because some of the data needed to update the partition in heat_part could now be on a new locality, we must devise a way of moving data to the locality of the middle partition. We accomplished this by adding a switch in the function get_data() that returns the end element of the buffer data_ if it is from the left partition or the first element of the buffer if the data is from the right partition. In this way only the necessary elements, not the whole buffer, are exchanged between nodes. The reader should be reminded that this exchange of end elements occurs in the function get_data() and, therefore, is executed asynchronously.

Now that we have the code running distributed, it is time to make some optimizations. The function heat_part spends most of its time on two tasks: retrieving remote data and working on the data in the middle partition. Because we know that the data for the middle partition is local, we can overlap the work on the middle partition with that of the possibly remote call of get_data(). This algorithmic change, which was implemented in example 7, can be seen below:

    // The partitioned operator, it invokes the heat operator above on all elements
    // of a partition.
    static partition heat_part(
        partition const& left, partition const& middle, partition const& right)
    {
        using hpx::dataflow;
        using hpx::unwrapping;

        hpx::shared_future<partition_data> middle_data =
            middle.get_data(partition_server::middle_partition);

        hpx::future<partition_data> next_middle = middle_data.then(
            unwrapping([middle](partition_data const& m) -> partition_data {
                HPX_UNUSED(middle);

                // All local operations are performed once the middle data of
                // the previous time step becomes available.
                std::size_t size = m.size();
                partition_data next(size);
                for (std::size_t i = 1; i != size - 1; ++i)
                    next[i] = heat(m[i - 1], m[i], m[i + 1]);
                return next;
            }));

        return dataflow(hpx::launch::async,
            unwrapping([left, middle, right](partition_data next,
                           partition_data const& l, partition_data const& m,
                           partition_data const& r) -> partition {
                HPX_UNUSED(left);
                HPX_UNUSED(right);

                // Calculate the missing boundary elements once the
                // corresponding data has become available.
                std::size_t size = m.size();
                next[0] = heat(l[size - 1], m[0], m[1]);
                next[size - 1] = heat(m[size - 2], m[size - 1], r[0]);

                // The new partition_data will be allocated on the same locality
                // as 'middle'.
                return partition(middle.get_id(), std::move(next));
            }),
            std::move(next_middle),
            left.get_data(partition_server::left_partition), middle_data,
            right.get_data(partition_server::right_partition));
    }

Example 8 completes the futurization process and utilizes the full potential of HPX by distributing the program flow to multiple localities, usually defined as nodes in a cluster. It accomplishes this task by running an instance of HPX main on each locality. In order to coordinate the execution of the program, the struct stepper is wrapped into a component. In this way, each locality contains an instance of stepper that executes its own instance of the function do_work(). This scheme does create an interesting synchronization problem that must be solved. When the program flow was being coordinated on the head node, the GID of each component was known. However, when we distribute the program flow, each partition has no notion of the GID of its neighbor if the next partition is on another locality. In order to make the GIDs of neighboring partitions visible to each other, we created two buffers to store the GIDs of the remote neighboring partitions on the left and right respectively. These buffers are filled by sending the GID of newly created edge partitions to the right and left buffers of the neighboring localities.

In order to finish the simulation, the solution vectors named result are then gathered together on locality 0 and added into a vector of spaces overall_result using the HPX functions gather_id and gather_here.

Example 8 completes this example series, which takes the serial code of example 1 and incrementally morphs it into a fully distributed parallel code. This evolution was guided by the simple principles of futurization, the knowledge of grainsize, and utilization of components. Applying these techniques easily facilitates the scalable parallelization of most applications.

Serializing user-defined types#

In order to facilitate the sending and receiving of complex datatypes, HPX provides a serialization abstraction.

Just like Boost, HPX allows users to serialize user-defined types by either providing the serializer as a member function or defining the serialization as a free function.

Unlike Boost, HPX ignores the second unsigned int parameter; it is there solely to preserve API compatibility with Boost.Serialization.

This tutorial was heavily inspired by Boost’s serialization concepts.

Setup#

The source code for this example can be found here: custom_serialization.cpp.

To compile this program, go to your HPX build directory (see Building HPX for information on configuring and building HPX) and enter:

$ make examples.quickstart.custom_serialization

To run the program type:

$ ./bin/custom_serialization

This should print:

Rectangle(Point(x=0,y=0),Point(x=0,y=5))
gravity.g = 9.81

Serialization Requirements#

In order to serialize objects in HPX, at least one of the following criteria must be met:

In the case of default constructible objects:

  • The object is an empty type.

  • The object has a serialization function as shown in this tutorial.

  • All members are accessible publicly and they can be used in structured binding contexts.

Otherwise:

  • They need to have special serialization support.

Member function serialization#

struct point_member_serialization
{
    int x{0};
    int y{0};

    // Required when defining the serialization function as private
    // In this case it isn't
    // Provides serialization access to HPX
    friend class hpx::serialization::access;

    // Second argument exists solely for compatibility with boost serialize
    // it is NOT processed by HPX in any way.
    template <typename Archive>
    void serialize(Archive& ar, const unsigned int)
    {
        // clang-format off
        ar & x & y;
        // clang-format on
    }
};

// Allow bitwise serialization
HPX_IS_BITWISE_SERIALIZABLE(point_member_serialization)

Notice that point_member_serialization is defined as bitwise serializable (see Bitwise serialization for bitwise copyable data for more details). HPX is also able to recursively serialize composite classes and structs, provided that their members are serializable.

struct rectangle_member_serialization
{
    point_member_serialization top_left;
    point_member_serialization lower_right;

    template <typename Archive>
    void serialize(Archive& ar, const unsigned int)
    {
        // clang-format off
        ar & top_left & lower_right;
        // clang-format on
    }
};

Free function serialization#

In order to decouple your models from HPX, HPX also allows for the definition of free function serializers.

struct rectangle_free
{
    point_member_serialization top_left;
    point_member_serialization lower_right;
};

template <typename Archive>
void serialize(Archive& ar, rectangle_free& pt, const unsigned int)
{
    // clang-format off
    ar & pt.lower_right & pt.top_left;
    // clang-format on
}

Even if you can’t modify a class to add the friend declaration, you can still serialize it, provided that the class is default constructible and you can reconstruct it yourself through its public interface.

class point_class
{
public:
    point_class(int x, int y)
      : x(x)
      , y(y)
    {
    }

    point_class() = default;

    [[nodiscard]] int get_x() const noexcept
    {
        return x;
    }

    [[nodiscard]] int get_y() const noexcept
    {
        return y;
    }

private:
    int x;
    int y;
};

template <typename Archive>
void load(Archive& ar, point_class& pt, const unsigned int)
{
    int x, y;
    ar >> x >> y;
    pt = point_class(x, y);
}

template <typename Archive>
void save(Archive& ar, point_class const& pt, const unsigned int)
{
    ar << pt.get_x() << pt.get_y();
}

// This tells HPX that you have split your serialize function into
// load and save
HPX_SERIALIZATION_SPLIT_FREE(point_class)

Serializing non default constructable classes#

Some classes don’t provide any default constructor.

class planet_weight_calculator
{
public:
    explicit planet_weight_calculator(double g)
      : g(g)
    {
    }

    template <class Archive>
    friend void save_construct_data(
        Archive&, planet_weight_calculator const*, unsigned int);

    [[nodiscard]] double get_g() const
    {
        return g;
    }

private:
    // Provides serialization access to HPX
    friend class hpx::serialization::access;
    template <class Archive>
    void serialize(Archive&, const unsigned int)
    {
        // Serialization will be done in the save_construct_data
        // Still needs to be defined
    }

    double g;
};

In this case you have to define a save_construct_data and load_construct_data in which you do the serialization yourself.

template <class Archive>
inline void save_construct_data(Archive& ar,
    planet_weight_calculator const* weight_calc, const unsigned int)
{
    ar << weight_calc->g;    // Do all of your serialization here
}

template <class Archive>
inline void load_construct_data(
    Archive& ar, planet_weight_calculator* weight_calc, const unsigned int)
{
    double g;
    ar >> g;

    // ::new(ptr) construct new object at given address
    hpx::construct_at(weight_calc, g);
}

Bitwise serialization for bitwise copyable data#

When sending types that are not arithmetic (as determined by std::is_arithmetic), HPX has to (de)serialize each object separately. However, if the class you are trying to send consists only of bitwise copyable datatypes, you may mark your class as such. HPX will then serialize your object bitwise instead of element-wise. This has enormous benefits, especially when sending a vector or array of your class. To mark your class this way, invoke the macro HPX_IS_BITWISE_SERIALIZABLE(T) with your custom class.

struct point_member_serialization
{
    int x{0};
    int y{0};

    // Required when defining the serialization function as private
    // In this case it isn't
    // Provides serialization access to HPX
    friend class hpx::serialization::access;

    // Second argument exists solely for compatibility with boost serialize
    // it is NOT processed by HPX in any way.
    template <typename Archive>
    void serialize(Archive& ar, const unsigned int)
    {
        // clang-format off
        ar & x & y;
        // clang-format on
    }
};

// Allow bitwise serialization
HPX_IS_BITWISE_SERIALIZABLE(point_member_serialization)

Manual#

The manual is your comprehensive guide to HPX. It contains detailed information on how to build and use HPX in different scenarios.

Prerequisites#

Supported platforms#

At this time, HPX supports the following platforms. Other platforms may work, but we do not test HPX with other platforms, so please be warned.

Table 1 Supported Platforms for HPX#

Name         Minimum Version       Architectures
Linux        2.6                   x86-32, x86-64, k1om
BlueGeneQ    V1R2M0                PowerPC A2
Windows      Any Windows system    x86-32, x86-64
Mac OSX      Any OSX system        x86-64
ARM          Any ARM system        Any architecture
RISC-V       Any RISC-V system     Any architecture

Supported compilers#

The table below shows the supported compilers for HPX.

Table 2 Supported Compilers for HPX#

Name                                            Minimum Version    Latest tested
GNU Compiler Collection (g++)                   12.0               15.0
clang: a C language family frontend for LLVM    16.0               20.0
Visual C++ (x64)                                2022               2022

Software and libraries#

The table below presents all the necessary prerequisites for building HPX.

Table 3 Software prerequisites for HPX#

Name                                    Minimum Version    Latest tested
Build System
  CMake                                 3.20               4.1
Required Libraries
  Boost                                 1.71.0             1.88.0
  Portable Hardware Locality (HWLOC)    1.5                2.4

The most important dependencies are Boost and Portable Hardware Locality (HWLOC). The installation of Boost is described in detail in Boost’s Getting Started document. A recent version of hwloc is required in order to support thread pinning and NUMA awareness and can be found in Hwloc Downloads.

HPX is written in 99.99% Standard C++ (the remaining 0.01% is platform specific assembly code). As such, HPX is compilable with almost any standards compliant C++ compiler. The code base takes advantage of C++ language and standard library features when available.

Note

When building Boost using gcc, please note that it is required to specify a cxxflags=-std=c++20 command line argument to b2 (bjam).

Note

In most configurations, HPX depends only on header-only Boost. Boost.Filesystem is required if the standard library does not support filesystem. The following are not needed by default, but are required in certain configurations: Boost.Chrono, Boost.DateTime, Boost.Log, Boost.LogSetup, Boost.Regex, and Boost.Thread.

Depending on the options you chose while building and installing HPX, you will find that HPX may depend on several other libraries such as those listed below.

Note

In order to use a high speed parcelport, we currently recommend configuring HPX to use MPI so that MPI can be used for communication between different localities. Please set the CMake variable MPI_CXX_COMPILER to your MPI C++ compiler wrapper if not detected automatically.

Table 4 Optional software prerequisites for HPX#

Name                                                   Minimum version
google-perftools                                       1.7.1
jemalloc                                               2.1.0
mi-malloc                                              1.0.0
Performance Application Programming Interface (PAPI)

Getting HPX#

Download a tarball of the latest release from HPX Downloads and unpack it or clone the repository directly using git:

$ git clone https://github.com/STEllAR-GROUP/hpx.git

It is also recommended that you check out the latest stable tag:

$ cd hpx
$ git checkout v2.0.0

Building HPX#

Basic information#

The build system for HPX is based on CMake, a cross-platform build-generator tool which is not responsible for building the project but rather generates the files needed by your build tool (GNU make, Visual Studio, etc.) for building HPX. If CMake is not already installed in your system, you can download it and install it here: CMake Downloads.

Once CMake has been run, the build process can be started. The build process consists of the following parts:

  • The HPX core libraries (target core): This forms the basic set of HPX libraries.

  • HPX Examples (target examples): This target is enabled by default and builds all HPX examples (disable by setting HPX_WITH_EXAMPLES:BOOL=Off). HPX examples are part of the all target and are included in the installation if enabled.

  • HPX Tests (target tests): This target builds the HPX test suite and is enabled by default (disable by setting HPX_WITH_TESTS:BOOL=Off). The tests are not built by the all target and have to be built separately.

  • HPX Documentation (target docs): This target builds the documentation, and is not enabled by default (enable by setting HPX_WITH_DOCUMENTATION:BOOL=On). For more information see Documentation.

The HPX build process is highly configurable through CMake, and various CMake variables influence the build process. A list with the most important CMake variables can be found in the section that follows, while the complete list of available CMake variables is in CMake options. These variables can be used to refine the recipes that can be found at Platform specific build recipes, a section that shows some basic steps on how to build HPX for a specific platform.

In order to use HPX, only the core libraries are required. In order to use the optional libraries, you need to specify them as link dependencies in your build (See Creating HPX projects).

Most important CMake options#

While building HPX, you are provided with multiple CMake options which correspond to different configurations. Below, there is a set of the most important and frequently used CMake options.

HPX_WITH_MALLOC#

Use a custom allocator. Using a custom allocator tuned for multithreaded applications is very important for the performance of HPX applications. When debugging applications, it’s useful to set this to system, as custom allocators can hide some memory-related bugs. Note that setting this to something other than system requires an external dependency.

HPX_WITH_CUDA#

Enable support for CUDA. Use CMAKE_CUDA_COMPILER to set the CUDA compiler. This is a standard CMake variable, like CMAKE_CXX_COMPILER.

HPX_WITH_PARCELPORT_MPI#

Enable the MPI parcelport. This enables the use of MPI for the networking operations in the HPX runtime. The default value is OFF because it’s not available on all systems and/or requires another dependency. However, it is the recommended parcelport.

HPX_WITH_PARCELPORT_TCP#

Enable the TCP parcelport. Enables the use of TCP for networking in the runtime. The default value is ON. However, it’s only recommended for debugging purposes, as it is slower than the MPI parcelport.

HPX_WITH_PARCELPORT_LCI#

Enable the LCI parcelport. This enables the use of LCI for the networking operations in the HPX runtime. The default value is OFF because it’s not available on all systems and/or requires another dependency. However, this experimental parcelport may provide better performance than the MPI parcelport. Please refer to Using the LCI parcelport for more information about the LCI parcelport.

HPX_WITH_APEX#

Enable APEX integration. APEX can be used to profile HPX applications. In particular, it provides information about individual tasks in the HPX runtime.

HPX_WITH_GENERIC_CONTEXT_COROUTINES#

Enable Boost.Context for task context switching. It must be enabled for non-x86 architectures such as ARM and Power.

HPX_WITH_MAX_CPU_COUNT#

Set the maximum CPU count supported by HPX. The default value is 64, and should be set to a number at least as high as the number of cores on a system including virtual cores such as hyperthreads.

HPX_WITH_CXX_STANDARD#

Set a specific C++ standard version e.g. HPX_WITH_CXX_STANDARD=23. The default and minimum value is 20. Possible values are 20, 23, or 26.

HPX_WITH_EXAMPLES#

Build examples.

HPX_WITH_TESTS#

Build tests.

For a complete list of available CMake variables that influence the build of HPX, see CMake options.

Build types#

CMake can be configured to generate project files suitable for builds that have enabled debugging support or for an optimized build (without debugging support). The CMake variable used to set the build type is CMAKE_BUILD_TYPE (for more information see the CMake Documentation). Available build types are:

  • Debug: Full debug symbols are available as well as additional assertions to help debugging. To enable the debug build type for the HPX API, the C++ Macro HPX_DEBUG is defined.

  • RelWithDebInfo: Release build with debugging symbols. This is most useful for profiling applications

  • Release: Release build. This disables assertions and enables default compiler optimizations.

  • MinSizeRel: Release build with optimizations for small binary sizes.

Important

We currently don’t guarantee ABI compatibility between Debug and Release builds. Please make sure that applications built against HPX use the same build type as you used to build HPX. For CMake builds, this means that the CMAKE_BUILD_TYPE variables have to match and for projects not using CMake, the HPX_DEBUG macro has to be set in debug mode.

Platform specific build recipes#
Unix variants#

Once you have the source code and the dependencies and assuming all your dependencies are in paths known to CMake, the following gets you started:

  1. First, set up a separate build directory to configure the project:

    $ mkdir build && cd build
    
  2. To configure the project you have the following options:

    • To build the core HPX libraries and examples, and install them to your chosen location (recommended):

    $ cmake -DCMAKE_INSTALL_PREFIX=/install/path ..
    

    Tip

    If you want to change CMake variables for your build, it is usually a good idea to start with a clean build directory to avoid configuration problems. It is especially important that you use a clean build directory when changing between Release and Debug modes.

    • To install HPX to the default system folders, simply leave out the CMAKE_INSTALL_PREFIX option:

    $ cmake ..
    
    • If your dependencies are in custom locations, you may need to tell CMake where to find them by passing one or more options to CMake as shown below:

    $ cmake -DBoost_ROOT=/path/to/boost \
          -DHwloc_ROOT=/path/to/hwloc \
          -DTcmalloc_ROOT=/path/to/tcmalloc \
          -DJemalloc_ROOT=/path/to/jemalloc \
          [other CMake variable definitions] \
          /path/to/source/tree
    

    For instance:

    $ cmake -DBoost_ROOT=~/packages/boost -DHwloc_ROOT=/packages/hwloc -DCMAKE_INSTALL_PREFIX=~/packages/hpx ~/downloads/hpx_1.5.1
    
    • If you want to try HPX without using a custom allocator pass -DHPX_WITH_MALLOC=system to CMake:

    $ cmake -DCMAKE_INSTALL_PREFIX=/install/path -DHPX_WITH_MALLOC=system ..
    

    Note

    Please pay special attention to the section about HPX_WITH_MALLOC:STRING as this is crucial for getting decent performance.

    Important

    If you are building HPX for a system with more than 64 processing units, you must change the CMake variable HPX_WITH_MAX_CPU_COUNT (to a value at least as big as the number of (virtual) cores on your system). Note that the default value is 64.

    Caution

    Compiling and linking HPX needs a considerable amount of memory. It is advisable that at least 2 GB of memory per parallel process is available.

  3. Once the configuration is complete, to build the project you run:

    $ cmake --build . --target install

Windows#

Note

The following build recipes are mostly user-contributed and may be outdated. We always welcome updated and new build recipes.

To build HPX under Windows 10 x64 with Visual Studio 2015:

  • Download the CMake V3.19 installer (or latest version) from here

  • Download the hwloc V1.11.0 (or the latest version) from here and unpack it.

  • Download the latest Boost libraries from here and unpack them.

  • Build the Boost DLLs and LIBs by using these commands from Command Line (or PowerShell). Open CMD/PowerShell inside the Boost dir and type in:

    .\bootstrap.bat
    

    This batch file will set up everything needed to create a successful build. Now execute:

    .\b2.exe link=shared variant=release,debug architecture=x86 address-model=64 threading=multi --build-type=complete install
    

    This command will start a (very long) build of all available Boost libraries. Please, be patient.

  • Open CMake-GUI.exe and set up your source directory (input field ‘Where is the source code’) to the base directory of the source code you downloaded from HPX’s GitHub pages. Here’s an example of CMake path settings, which point to the Documents/GitHub/hpx folder:

    _images/cmake_settings1.png

    Fig. 3 Example CMake path settings.#

    Inside ‘Where is the source-code’ enter the base directory of your HPX source directory (do not enter the “src” sub-directory!). Inside ‘Where to build the binaries’ you should put in the path where all the building processes will happen. This is important because the building machinery will do an “out-of-tree” build. CMake will not touch or change the original source files in any way. Instead, it will generate Visual Studio Solution Files, which will build HPX packages out of the HPX source tree.

  • Set new configuration variables (in CMake, not in Windows environment): Boost_ROOT, Hwloc_ROOT, Asio_ROOT, CMAKE_INSTALL_PREFIX. The meaning of these variables is as follows:

    • Boost_ROOT: the root directory of the unpacked Boost headers/cpp files.

    • Hwloc_ROOT: the root directory of the unpacked Portable Hardware Locality files.

    • Asio_ROOT: the root directory of the unpacked Asio files. Alternatively, set HPX_WITH_FETCH_ASIO to True.

    • CMAKE_INSTALL_PREFIX: the directory where the future builds of HPX should be installed.

      Note

      HPX is a very large software collection, so it is not recommended to use the default C:\Program Files\hpx. Many users may prefer to use simpler paths without whitespace, like C:\bin\hpx or D:\bin\hpx etc.

    To insert new variables, click on “Add Entry”, insert the name inside “Name”, select PATH as Type, and put the path in the “Path” text field. Repeat this for the first three variables.

    This is how variable insertion will look:

    _images/cmake_settings2.png

    Fig. 4 Example CMake adding entry.#

    Alternatively, users could provide Boost_LIBRARYDIR instead of Boost_ROOT; the difference is that Boost_LIBRARYDIR should point to the subdirectory inside the Boost root where all the compiled DLLs/LIBs are. For example, Boost_LIBRARYDIR may point to the bin.v2 subdirectory under the Boost root directory. It is important to keep the meanings of these two variables separate: Boost_ROOT points to the root folder of the Boost library, while Boost_LIBRARYDIR points to the subdirectory inside the Boost root folder where the compiled binaries are.

  • Click the ‘Configure’ button of CMake-GUI. You will be immediately presented with a small window where you can select the C++ compiler to be used within Visual Studio. This has been tested using the latest v14 (a.k.a. Visual C++ 2015), but older versions should be sufficient too. Make sure to select the 64-bit compiler.

  • After the configure process has finished successfully, click the ‘Generate’ button. CMake will then put new Visual Studio solution files into the build folder you selected at the beginning.

  • Open Visual Studio and load the HPX.sln from your build folder.

  • Go to CMakePredefinedTargets and build the INSTALL project:

    _images/vs_targets_install.png

    Fig. 5 Visual Studio INSTALL target.#

    It will take some time to compile everything, and in the end you should see an output similar to this one:

    _images/vs_build_output.png

    Fig. 6 Visual Studio build output.#

CMake options#

In order to configure HPX, you can set a variety of options to allow CMake to generate your specific makefiles/project files. A list of the most important CMake options can be found in Most important CMake options, while this section includes the comprehensive list.

Variables that influence how HPX is built#

The options are split into these categories:

Generic options#
HPX_WITH_AUTOMATIC_SERIALIZATION_REGISTRATION:BOOL#

Use automatic serialization registration for actions and functions. This affects compatibility between HPX applications compiled with different compilers (default ON)

HPX_WITH_BENCHMARK_SCRIPTS_PATH:PATH#

Directory to place batch scripts in

HPX_WITH_BUILD_BINARY_PACKAGE:BOOL#

Build HPX on the build infrastructure on any LINUX distribution (default: OFF).

HPX_WITH_CHECK_MODULE_DEPENDENCIES:BOOL#

Verify that no modules are cross-referenced from a different module category (default: OFF)

HPX_WITH_COMPILER_WARNINGS:BOOL#

Enable compiler warnings (default: ON)

HPX_WITH_COMPILER_WARNINGS_AS_ERRORS:BOOL#

Turn compiler warnings into errors (default: OFF)

HPX_WITH_COMPRESSION_BZIP2:BOOL#

Enable bzip2 compression for parcel data (default: OFF).

HPX_WITH_COMPRESSION_SNAPPY:BOOL#

Enable snappy compression for parcel data (default: OFF).

HPX_WITH_COMPRESSION_ZLIB:BOOL#

Enable zlib compression for parcel data (default: OFF).

HPX_WITH_CUDA:BOOL#

Enable support for CUDA (default: OFF)

HPX_WITH_CXX_MODULES:BOOL#

Enable exposing C++20 modules (default: OFF).

HPX_WITH_CXX_STANDARD:STRING#

Set the C++ standard to use when compiling HPX itself. (default: 20)

HPX_WITH_DATAPAR:BOOL#

Enable data parallel algorithm support using Vc library (default: ON)

HPX_WITH_DATAPAR_BACKEND:STRING#

Define which vectorization library should be used. Options are: VC, EVE, STD_EXPERIMENTAL_SIMD, SVE, NONE

HPX_WITH_DATAPAR_VC_NO_LIBRARY:BOOL#

Don’t link with the Vc static library (default: OFF)

HPX_WITH_DEPRECATION_WARNINGS:BOOL#

Enable warnings for deprecated facilities (default: ON).

HPX_WITH_DISABLED_SIGNAL_EXCEPTION_HANDLERS:BOOL#

Disables the mechanism that produces debug output for caught signals and unhandled exceptions (default: OFF)

HPX_WITH_DYNAMIC_HPX_MAIN:BOOL#

Enable dynamic overload of system main() (Linux and Apple only, default: ON)

HPX_WITH_FAULT_TOLERANCE:BOOL#

Build HPX to tolerate failures of nodes, i.e. ignore errors in active communication channels (default: OFF)

HPX_WITH_FULL_RPATH:BOOL#

Build and link HPX libraries and executables with full RPATHs (default: ON)

HPX_WITH_GCC_VERSION_CHECK:BOOL#

Don’t ignore version reported by gcc (default: ON)

HPX_WITH_GENERIC_CONTEXT_COROUTINES:BOOL#

Use Boost.Context as the underlying coroutines context switch implementation.

HPX_WITH_HIDDEN_VISIBILITY:BOOL#

Use -fvisibility=hidden for builds on platforms which support it (default OFF)

HPX_WITH_HIP:BOOL#

Enable compilation with HIPCC (default: OFF)

HPX_WITH_HIPSYCL:BOOL#

Use hipsycl cmake integration (default: OFF)

HPX_WITH_IGNORE_COMPILER_COMPATIBILITY:BOOL#

Ignore compiler incompatibility in dependent projects (default: ON).

HPX_WITH_LOGGING:BOOL#

Build HPX with logging enabled (default: ON).

HPX_WITH_MALLOC:STRING#

Define which allocator should be linked in. Options are: system, tcmalloc, jemalloc, mimalloc, tbbmalloc, and custom (default is: tcmalloc)

HPX_WITH_MODULES_AS_STATIC_LIBRARIES:BOOL#

Compile HPX modules as STATIC (whole-archive) libraries instead of OBJECT libraries (Default: ON)

HPX_WITH_MODULE_COMPATIBILITY_HEADERS:BOOL#

Generate backwards-compatibility headers for HPX Modules (default: OFF)

HPX_WITH_NICE_THREADLEVEL:BOOL#

Set HPX worker threads to have high NICE level (may impact performance) (default: OFF)

HPX_WITH_PARCEL_COALESCING:BOOL#

Enable the parcel coalescing plugin (default: ON).

HPX_WITH_PKGCONFIG:BOOL#

Enable generation of pkgconfig files (default: ON on Linux without CUDA/HIP, otherwise OFF)

HPX_WITH_PRECOMPILED_HEADERS:BOOL#

Enable precompiled headers for certain build targets (experimental) (default OFF)

HPX_WITH_RUN_MAIN_EVERYWHERE:BOOL#

Run hpx_main by default on all localities (default: OFF, deprecated, will be removed).

HPX_WITH_STACKOVERFLOW_DETECTION:BOOL#

Enable stackoverflow detection for HPX threads/coroutines (default: OFF, debug: ON).

HPX_WITH_STATIC_LINKING:BOOL#

Compile HPX statically linked libraries (Default: OFF)

HPX_WITH_SUPPORT_NO_UNIQUE_ADDRESS_ATTRIBUTE:BOOL#

Enable the use of the [[no_unique_address]] attribute (default: ON)

HPX_WITH_SYCL:BOOL#

Enable support for Sycl (default: OFF)

HPX_WITH_SYCL_FLAGS:STRING#

Sycl compile flags for selecting specific targets (default: empty)

HPX_WITH_UNITY_BUILD:BOOL#

Enable unity build for certain build targets (default OFF)

HPX_WITH_VIM_YCM:BOOL#

Generate HPX completion file for VIM YouCompleteMe plugin

HPX_WITH_ZERO_COPY_SERIALIZATION_THRESHOLD:STRING#

The threshold in bytes at which zero-copy optimizations are performed (default: 8192)

Build Targets options#
HPX_WITH_ASIO_TAG:STRING#

Asio repository tag or branch

HPX_WITH_COMPILE_ONLY_TESTS:BOOL#

Create build system support for compile time only HPX tests (default ON)

HPX_WITH_DISTRIBUTED_RUNTIME:BOOL#

Enable the distributed runtime (default: ON). Turning off the distributed runtime completely disallows the creation and use of components and actions. Turning this option off is experimental!

HPX_WITH_DOCUMENTATION:BOOL#

Build the HPX documentation (default OFF).

HPX_WITH_DOCUMENTATION_OUTPUT_FORMATS:STRING#

List of documentation output formats to generate. Valid options are html;singlehtml;latexpdf;man. Multiple values can be separated with semicolons. (default html).

HPX_WITH_EXAMPLES:BOOL#

Build the HPX examples (default ON)

HPX_WITH_EXAMPLES_HDF5:BOOL#

Enable examples requiring HDF5 support (default: OFF).

HPX_WITH_EXAMPLES_OPENMP:BOOL#

Enable examples requiring OpenMP support (default: OFF).

HPX_WITH_EXAMPLES_QT4:BOOL#

Enable examples requiring Qt4 support (default: OFF).

HPX_WITH_EXAMPLES_QTHREADS:BOOL#

Enable examples requiring QThreads support (default: OFF).

HPX_WITH_EXAMPLES_TBB:BOOL#

Enable examples requiring TBB support (default: OFF).

HPX_WITH_EXECUTABLE_PREFIX:STRING#

Executable prefix (default none), ‘hpx_’ useful for system install.

HPX_WITH_FAIL_COMPILE_TESTS:BOOL#

Create build system support for fail compile HPX tests (default ON)

HPX_WITH_FETCH_APEX:BOOL#

Use FetchContent to fetch APEX. By default an installed APEX will be used. (default: OFF)

HPX_WITH_FETCH_ASIO:BOOL#

Use FetchContent to fetch Asio. By default an installed Asio will be used. (default: OFF)

HPX_WITH_FETCH_BOOST:BOOL#

Use FetchContent to fetch Boost. By default an installed Boost will be used. (default: OFF)

HPX_WITH_FETCH_GASNET:BOOL#

Use FetchContent to fetch GASNET. By default an installed GASNET will be used. (default: OFF).

HPX_WITH_FETCH_HWLOC:BOOL#

Use FetchContent to fetch Hwloc. By default an installed Hwloc will be used. (default: OFF)

HPX_WITH_FETCH_LCI:BOOL#

Use FetchContent to fetch LCI. By default an installed LCI will be used. (default: OFF)

HPX_WITH_IO_COUNTERS:BOOL#

Enable IO counters (default: ON)

HPX_WITH_LCI_BOOTSTRAP_MPI:BOOL#

Configure the autofetched LCI with mpi bootstrap support (default: OFF)

HPX_WITH_LCI_TAG:STRING#

LCI repository tag or branch

HPX_WITH_NANOBENCH:BOOL#

Use Nanobench for performance tests. Nanobench will be fetched using FetchContent (default: OFF)

Number of Parallel link jobs while building hpx (only for Ninja as generator) (default 2)

HPX_WITH_TESTS:BOOL#

Build the HPX tests (default ON)

HPX_WITH_TESTS_BENCHMARKS:BOOL#

Build HPX benchmark tests (default: ON)

HPX_WITH_TESTS_EXAMPLES:BOOL#

Add HPX examples as tests (default: ON)

HPX_WITH_TESTS_EXTERNAL_BUILD:BOOL#

Build external cmake build tests (default: ON)

HPX_WITH_TESTS_HEADERS:BOOL#

Build HPX header tests (default: OFF)

HPX_WITH_TESTS_REGRESSIONS:BOOL#

Build HPX regression tests (default: ON)

HPX_WITH_TESTS_UNIT:BOOL#

Build HPX unit tests (default: ON)

HPX_WITH_THRUST:BOOL#

Enable support for NVIDIA Thrust integration (default: ON when CUDA is enabled, OFF otherwise)

HPX_WITH_TOOLS:BOOL#

Build HPX tools (default: OFF)

Thread Manager options#
HPX_COROUTINES_WITH_SWAP_CONTEXT_EMULATION:BOOL#

Emulate SwapContext API for coroutines (Windows only, default: OFF)

HPX_COROUTINES_WITH_THREAD_SCHEDULE_HINT_RUNS_AS_CHILD:BOOL#

Futures attempt to run associated threads directly if those have not been started (default: OFF)

HPX_WITH_COROUTINE_COUNTERS:BOOL#

Enable keeping track of coroutine creation and rebind counts (default: OFF)

HPX_WITH_IO_POOL:BOOL#

Disable internal IO thread pool, do not change if not absolutely necessary (default: ON)

HPX_WITH_MAX_CPU_COUNT:STRING#

HPX applications will not use more than this number of OS-Threads (empty string means dynamic) (default: “”)

HPX_WITH_MAX_NUMA_DOMAIN_COUNT:STRING#

HPX applications will not run on machines with more NUMA domains (default: 8)

HPX_WITH_SCHEDULER_LOCAL_STORAGE:BOOL#

Enable scheduler local storage for all HPX schedulers (default: OFF)

HPX_WITH_SPINLOCK_DEADLOCK_DETECTION:BOOL#

Enable spinlock deadlock detection (default: OFF)

HPX_WITH_SPINLOCK_POOL_NUM:STRING#

Number of elements a spinlock pool manages (default: 128)

HPX_WITH_STACKTRACES:BOOL#

Attach backtraces to HPX exceptions (default: ON)

HPX_WITH_STACKTRACES_DEMANGLE_SYMBOLS:BOOL#

Thread stack back trace symbols will be demangled (default: ON)

HPX_WITH_STACKTRACES_STATIC_SYMBOLS:BOOL#

Thread stack back trace will resolve static symbols (default: OFF)

HPX_WITH_THREAD_BACKTRACE_DEPTH:STRING#

Thread stack back trace depth being captured (default: 20)

HPX_WITH_THREAD_BACKTRACE_ON_SUSPENSION:BOOL#

Enable thread stack back trace being captured on suspension (default: OFF)

HPX_WITH_THREAD_CREATION_AND_CLEANUP_RATES:BOOL#

Enable measuring thread creation and cleanup times (default: OFF)

HPX_WITH_THREAD_CUMULATIVE_COUNTS:BOOL#

Enable keeping track of cumulative thread counts in the schedulers (default: ON)

HPX_WITH_THREAD_IDLE_RATES:BOOL#

Enable measuring the percentage of overhead times spent in the scheduler (default: OFF)

HPX_WITH_THREAD_LOCAL_STORAGE:BOOL#

Enable thread local storage for all HPX threads (default: OFF)

HPX_WITH_THREAD_MANAGER_IDLE_BACKOFF:BOOL#

HPX scheduler threads do exponential backoff on idle queues (default: ON)

HPX_WITH_THREAD_QUEUE_WAITTIME:BOOL#

Enable collecting queue wait times for threads (default: OFF)

HPX_WITH_THREAD_STACK_MMAP:BOOL#

Use mmap for stack allocation on appropriate platforms

HPX_WITH_THREAD_STEALING_COUNTS:BOOL#

Enable keeping track of counts of thread stealing incidents in the schedulers (default: OFF)

HPX_WITH_THREAD_TARGET_ADDRESS:BOOL#

Enable storing target address in thread for NUMA awareness (default: OFF)

HPX_WITH_TIMER_POOL:BOOL#

Disable internal timer thread pool, do not change if not absolutely necessary (default: ON)

HPX_WITH_WORK_REQUESTING_SCHEDULERS:BOOL#

Enable work requesting scheduler (default: ON)

AGAS options#
HPX_WITH_AGAS_DUMP_REFCNT_ENTRIES:BOOL#

Enable dumps of the AGAS refcnt tables to logs (default: OFF)

Parcelport options#
HPX_WITH_NETWORKING:BOOL#

Enable support for networking and multi-node runs (default: ON)

HPX_WITH_PARCELPORT_ACTION_COUNTERS:BOOL#

Enable performance counters reporting parcelport statistics on a per-action basis.

HPX_WITH_PARCELPORT_COUNTERS:BOOL#

Enable performance counters reporting parcelport statistics.

HPX_WITH_PARCELPORT_GASNET:BOOL#

Enable the GASNET based parcelport.

HPX_WITH_PARCELPORT_LCI:BOOL#

Enable the LCI based parcelport.

HPX_WITH_PARCELPORT_LCI_LOG:STRING#

Enable the LCI-parcelport-specific logger

HPX_WITH_PARCELPORT_LCI_PCOUNTER:STRING#

Enable the LCI-parcelport-specific performance counter

HPX_WITH_PARCELPORT_LIBFABRIC:BOOL#

Enable the libfabric based parcelport. This is currently an experimental feature

HPX_WITH_PARCELPORT_MPI:BOOL#

Enable the MPI based parcelport.

HPX_WITH_PARCELPORT_TCP:BOOL#

Enable the TCP based parcelport.

HPX_WITH_PARCEL_PROFILING:BOOL#

Enable profiling data for parcels

Profiling options#
HPX_WITH_APEX:BOOL#

Enable APEX instrumentation support.

HPX_WITH_ITTNOTIFY:BOOL#

Enable Amplifier (ITT) instrumentation support.

HPX_WITH_PAPI:BOOL#

Enable the PAPI based performance counter.

Debugging options#
HPX_WITH_ASSERTS_AS_CONTRACT_ASSERTS:BOOL#

Swap hpx_assert with hpx_contract_assert

HPX_WITH_ATTACH_DEBUGGER_ON_TEST_FAILURE:BOOL#

Break the debugger if a test has failed (default: OFF)

HPX_WITH_CONTRACTS:BOOL#

Enable C++ contracts support in HPX

HPX_WITH_PARALLEL_TESTS_BIND_NONE:BOOL#

Pass --hpx:bind=none to tests that may run in parallel (cmake -j flag) (default: OFF)

HPX_WITH_SANITIZERS:BOOL#

Configure with sanitizer instrumentation support.

HPX_WITH_TESTS_COMMAND_LINE:STRING#

Add given command line options to all tests run

HPX_WITH_TESTS_DEBUG_LOG:BOOL#

Turn on debug logs (--hpx:debug-hpx-log) for tests (default: OFF)

HPX_WITH_TESTS_DEBUG_LOG_DESTINATION:STRING#

Destination for test debug logs (default: cout)

HPX_WITH_TESTS_MAX_THREADS_PER_LOCALITY:STRING#

Maximum number of threads to use for tests (default: 0, use the number of threads specified by the test)

HPX_WITH_THREAD_DEBUG_INFO:BOOL#

Enable thread debugging information (default: OFF, implicitly enabled in debug builds)

HPX_WITH_THREAD_DESCRIPTION_FULL:BOOL#

Use function address for thread description (default: OFF)

HPX_WITH_THREAD_GUARD_PAGE:BOOL#

Enable thread guard page (default: ON)

HPX_WITH_VALGRIND:BOOL#

Enable Valgrind instrumentation support.

HPX_WITH_VERIFY_LOCKS:BOOL#

Enable lock verification code (default: OFF, enabled in debug builds)

HPX_WITH_VERIFY_LOCKS_BACKTRACE:BOOL#

Enable thread stack back trace being captured on lock registration (to be used in combination with HPX_WITH_VERIFY_LOCKS=ON, default: OFF)

Modules options#
HPX_ALLOCATOR_SUPPORT_WITH_CACHING:BOOL#

Enable caching allocator. (default: ON)

HPX_COMMAND_LINE_HANDLING_LOCAL_WITH_JSON_CONFIGURATION_FILES:BOOL#

Enable reading JSON formatted configuration files on the command line. (default: ON)

HPX_DATASTRUCTURES_WITH_ADAPT_STD_TUPLE:BOOL#

Enable compatibility of hpx::get with std::tuple. (default: ON)

HPX_DATASTRUCTURES_WITH_ADAPT_STD_VARIANT:BOOL#

Enable compatibility of hpx::get with std::variant. (default: OFF)

HPX_FILESYSTEM_WITH_BOOST_FILESYSTEM_COMPATIBILITY:BOOL#

Enable Boost.FileSystem compatibility. (default: OFF)

HPX_FUNCTIONAL_WITH_BOOST_PLACEHOLDERS:BOOL#

Enable support for Boost placeholder types. (default: OFF)

HPX_ITERATOR_SUPPORT_WITH_BOOST_ITERATOR_TRAVERSAL_TAG_COMPATIBILITY:BOOL#

Enable Boost.Iterator traversal tag compatibility. (default: OFF)

HPX_LOGGING_WITH_SEPARATE_DESTINATIONS:BOOL#

Enable separate logging channels for AGAS, timing, and parcel transport. (default: ON)

HPX_SERIALIZATION_WITH_ALLOW_CONST_TUPLE_MEMBERS:BOOL#

Enable serializing std::tuple with const members. (default: OFF)

HPX_SERIALIZATION_WITH_ALLOW_RAW_POINTER_SERIALIZATION:BOOL#

Enable serializing raw pointers. (default: OFF)

HPX_SERIALIZATION_WITH_ALL_TYPES_ARE_BITWISE_SERIALIZABLE:BOOL#

Assume all types are bitwise serializable. (default: OFF)

HPX_SERIALIZATION_WITH_BOOST_TYPES:BOOL#

Enable serialization of certain Boost types. (default: OFF)

HPX_SERIALIZATION_WITH_SUPPORTS_ENDIANESS:BOOL#

Support endian conversion on input and output archives. (default: OFF)

HPX_TOPOLOGY_WITH_ADDITIONAL_HWLOC_TESTING:BOOL#

Enable HWLOC filtering that makes it report no cores. This is purely an option supporting better testing; do not enable it under normal circumstances. (default: OFF)

HPX_WITH_POWER_COUNTER:BOOL#

Enable use of performance counters based on pwr library (default: OFF)

Additional tools and libraries used by HPX#

Here is a list of additional libraries and tools that are either optionally supported by the build system or are optionally required for certain examples or tests. These libraries and tools can be detected by the HPX build system.

Each of the tools or libraries listed here will be automatically detected if they are installed in some standard location. If a tool or library is installed in a different location, you can specify its base directory by appending _ROOT to the variable name as listed below. For instance, to configure a custom directory for Boost, specify Boost_ROOT=/custom/boost/root.

Boost_ROOT:PATH#

Specifies where to look for the Boost installation to be used for compiling HPX. Set this if CMake is not able to locate a suitable version of Boost. The directory specified here can be either the root of an installed Boost distribution or the directory where you unpacked and built Boost without installing it (with staged libraries).

Hwloc_ROOT:PATH#

Specifies where to look for the hwloc library. Set this if CMake is not able to locate a suitable version of hwloc. Hwloc provides platform-independent support for extracting information about the used hardware architecture (number of cores, number of NUMA domains, hyperthreading, etc.). HPX utilizes this information if available.

Papi_ROOT:PATH#

Specifies where to look for the PAPI library. The PAPI library is needed to compile a special component exposing PAPI hardware events and counters as HPX performance counters. This is not available on the Windows platform.

Amplifier_ROOT:PATH#

Specifies where to look for one of the tools of the Intel Parallel Studio product, either Intel Amplifier or Intel Inspector. This should be set if the CMake variable HPX_WITH_ITTNOTIFY is set to ON. Enabling ITT support in HPX will integrate any application with the mentioned Intel tools, which customizes the generated information for your application and improves the generated diagnostics.

In addition, some of the examples may need the following variables:

Hdf5_ROOT:PATH#

Specifies where to look for the Hierarchical Data Format V5 (HDF5) include files and libraries.

Migration guide#

The Migration Guide is a resource for developers who want to move their parallel computing applications from other APIs (e.g. OpenMP, Intel Threading Building Blocks (TBB), MPI) to HPX. HPX is a C++ library for parallel and distributed computing that provides a wide range of features and capabilities. This guide explains the key differences between these APIs and HPX, and it provides step-by-step instructions for converting existing code to HPX.

Some general steps that can be used to migrate code to HPX code are the following:

  1. Install HPX using the Quick start guide.

  2. Include the HPX header files:

    Add the necessary header files for HPX at the beginning of your code, such as:

    #include <hpx/init.hpp>
    
  3. Replace your code with HPX code using the guide that follows.

  4. Use HPX-specific features and APIs:

    HPX provides additional features and APIs that can be used to take advantage of the library’s capabilities. For example, you can use the HPX asynchronous execution to express fine-grained tasks and dependencies, or utilize HPX’s distributed computing features for distributed memory systems.

  5. Compile and run the HPX code:

    Compile the converted code with the HPX library and run it using the appropriate HPX runtime environment.

OpenMP#

The OpenMP API supports multi-platform shared-memory parallel programming in C/C++. Typically it is used for loop-level parallelism, but it also supports function-level parallelism. Below are some examples on how to convert OpenMP to HPX code:

OpenMP parallel for loop#
Parallel for loop#

OpenMP code:

#pragma omp parallel for
for (int i = 0; i < n; ++i) {
    // loop body
}

HPX equivalent:

#include <hpx/algorithm.hpp>

hpx::experimental::for_loop(hpx::execution::par, 0, n, [&](int i) {
    // loop body
});

In the above code, the OpenMP #pragma omp parallel for directive is replaced with hpx::experimental::for_loop from the HPX library. The loop body within the lambda function will be executed in parallel for each iteration.

Private variables#

OpenMP code:

int x = 0;

#pragma omp parallel for private(x)
for (int i = 0; i < n; ++i) {
    // loop body
}

HPX equivalent:

#include <hpx/algorithm.hpp>

hpx::experimental::for_loop(hpx::execution::par, 0, n, [&](int i) {
    int x = 0; // Declare 'x' as a local variable inside the loop body
    // loop body
});

The variable x is declared as a local variable inside the loop body, ensuring that it is private to each thread.

Shared variables#

OpenMP code:

int x = 0;

#pragma omp parallel for shared(x)
for (int i = 0; i < n; ++i) {
    // loop body
}

HPX equivalent:

#include <hpx/algorithm.hpp>
#include <atomic>

std::atomic<int> x = 0; // Declare 'x' as a shared variable outside the loop

hpx::experimental::for_loop(hpx::execution::par, 0, n, [&](int i) {
    // loop body
});

Because the lambda captures x by reference, declaring x outside the for_loop shares it among all threads. Declaring it as std::atomic makes concurrent modifications from different threads safe.
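As a minimal sketch of a shared atomic counter in use, the following program lets several threads increment one std::atomic<int>. It uses std::thread from the C++ standard library as a stand-in for the HPX parallel loop, an assumption made so that the sketch compiles without HPX. The function name count_iterations is hypothetical.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: every iteration increments one shared counter.
// std::thread stands in for the HPX parallel loop in this sketch.
int count_iterations(int n, int num_threads)
{
    std::atomic<int> x{0};    // shared by all threads, like 'x' above

    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t)
    {
        workers.emplace_back([&x, n, num_threads, t] {
            // Thread t handles iterations t, t + num_threads, ...
            for (int i = t; i < n; i += num_threads)
                x.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto& w : workers)
        w.join();    // joining makes all increments visible to the caller

    return x.load();
}
```

With a plain int instead of std::atomic<int>, the concurrent increments would be a data race and the result would be undefined.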

Number of threads#

OpenMP code:

#pragma omp parallel for num_threads(2)
for (int i = 0; i < n; ++i) {
    // loop body
}

HPX equivalent:

#include <hpx/algorithm.hpp>
#include <hpx/execution.hpp>

hpx::execution::experimental::num_cores nc(2);

hpx::experimental::for_loop(hpx::execution::par.with(nc), 0, n, [&](int i) {
    // loop body
});

To declare the number of threads to be used for the parallel region, you can use hpx::execution::experimental::num_cores and pass the number of cores (nc) to hpx::experimental::for_loop using hpx::execution::par.with(nc). This example uses 2 threads for the parallel loop.

Reduction#

OpenMP code:

int s = 0;

#pragma omp parallel for reduction(+: s)
for (int i = 0; i < n; ++i) {
    s += i;
    // loop body
}

HPX equivalent:

#include <hpx/algorithm.hpp>
#include <hpx/execution.hpp>
#include <functional>

int s = 0;

hpx::experimental::for_loop(hpx::execution::par, 0, n,
    hpx::experimental::reduction(s, 0, std::plus<>()), [&](int i, int& accum) {
        accum += i;
        // loop body
    });

The reduction clause specifies that the variable s should be reduced across iterations using the std::plus<> operation. Each thread accumulates into its own accumulator, which is initialized with the identity value 0, and the partial results are combined into s with the + operator once the loop has finished. The lambda function representing the loop body takes two parameters: i, which represents the loop index, and accum, which refers to the accumulator standing in for the reduction variable s. The lambda function is executed for each iteration of the loop. The reduction ensures that the accum values are correctly combined across different iterations and threads.
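The way partial results are formed and combined can be modeled in plain C++. The sketch below is not HPX's implementation: it processes the chunks one after another, whereas a parallel run would hand them to different threads. The function name chunked_sum and the chunking scheme are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical model of a '+' reduction over [0, n): each chunk owns a
// private accumulator that starts at the identity (0); the partial
// results are combined at the end. Requires num_chunks >= 1.
int chunked_sum(int n, int num_chunks)
{
    std::vector<int> partial(num_chunks, 0);    // one 'accum' per chunk
    int const chunk = (n + num_chunks - 1) / num_chunks;    // ceiling division

    for (int c = 0; c < num_chunks; ++c)
    {
        int const begin = std::min(n, c * chunk);
        int const end = std::min(n, begin + chunk);
        int& accum = partial[c];
        for (int i = begin; i < end; ++i)
            accum += i;    // same update as in the loop body above
    }

    int s = 0;
    for (int p : partial)
        s = std::plus<>()(s, p);    // combine step of the reduction
    return s;
}
```

Because addition is associative and commutative, the result does not depend on the number of chunks, which is the property that allows the iterations to be reduced in parallel.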

Schedule#

OpenMP code:

int s = 0;

// static scheduling with chunk size 1000
#pragma omp parallel for schedule(static, 1000)
for (int i = 0; i < n; ++i) {
    // loop body
}

HPX equivalent:

#include <hpx/algorithm.hpp>
#include <hpx/execution.hpp>

hpx::execution::experimental::static_chunk_size cs(1000);

hpx::experimental::for_loop(hpx::execution::par.with(cs), 0, n, [&](int i) {
    // loop body
});

To define the scheduling type, you can use the corresponding executor parameter from hpx::execution::experimental, define the chunk size (cs, here declared as 1000), and pass it to hpx::experimental::for_loop using hpx::execution::par.with(cs).
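To make the effect of a fixed chunk size concrete, the following standard-library sketch lists the index ranges that result from splitting a loop into chunks of a given size. It only models the partitioning; how HPX assigns the chunks to threads is not covered. The function name static_chunks is hypothetical.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical model: split [0, n) into consecutive chunks of at most
// 'chunk_size' iterations, as a static schedule with a fixed chunk size
// does. Requires chunk_size >= 1.
std::vector<std::pair<int, int>> static_chunks(int n, int chunk_size)
{
    std::vector<std::pair<int, int>> chunks;
    for (int begin = 0; begin < n; begin += chunk_size)
        chunks.emplace_back(begin, std::min(n, begin + chunk_size));
    return chunks;    // each pair is a half-open range [begin, end)
}
```

For n = 2500 and a chunk size of 1000 this yields the ranges [0, 1000), [1000, 2000) and [2000, 2500), so the last chunk is shorter than the others.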

Accordingly, other types of scheduling are available and can be used in a similar manner:

#include <hpx/execution.hpp>
hpx::execution::experimental::dynamic_chunk_size cs(1000);

#include <hpx/execution.hpp>
hpx::execution::experimental::guided_chunk_size cs(1000);

#include <hpx/execution.hpp>
hpx::execution::experimental::auto_chunk_size cs(1000);

OpenMP single thread#

OpenMP code:

{   // parallel code
    #pragma omp single
    {
        // single-threaded code
    }
    // more parallel code
}

HPX equivalent:

#include <hpx/mutex.hpp>
#include <mutex>

hpx::mutex mtx;

{   // parallel code
    {
        std::scoped_lock l(mtx); // acquire the lock before entering the block
        // single-threaded code
    }
    // more parallel code
}

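To make sure that only one thread at a time executes a specific piece of code within a parallel section, you can use hpx::mutex and std::scoped_lock to take ownership of the given mutex mtx. Note that, unlike with #pragma omp single, every thread that reaches the block still executes it, just never concurrently with another thread. For more information about mutexes please refer to Mutex.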
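A mutex serializes the block, but each thread that reaches it still runs it. If the block has to run exactly once in total, which is what #pragma omp single provides, a once-flag is one possible alternative. The sketch below shows the idea with std::call_once, std::once_flag and std::thread from the C++ standard library; using the standard facilities instead of HPX ones is an assumption made so that the sketch compiles without HPX, and the function name run_block_once is hypothetical.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical demo: returns how often the guarded block ran when
// 'num_threads' threads all reach it.
int run_block_once(int num_threads)
{
    std::once_flag flag;
    std::atomic<int> executions{0};

    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t)
    {
        workers.emplace_back([&] {
            // Only one thread runs the callable; the others wait until
            // it has finished and then skip it.
            std::call_once(flag, [&] { ++executions; });
        });
    }
    for (auto& w : workers)
        w.join();

    return executions.load();
}
```

Regardless of the number of threads, the guarded block is executed once, whereas the same block protected only by a mutex would be executed once per thread.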

OpenMP tasks#
Simple tasks#

OpenMP code:

// executed asynchronously by any available thread
#pragma omp task
{
    // task code
}

HPX equivalent:

#include <hpx/future.hpp>

auto future = hpx::async([](){
    // task code
});

or

#include <hpx/future.hpp>

hpx::post([](){
    // task code
}); // fire and forget

Tasks in HPX can be defined by calling hpx::async and passing the code you wish to run asynchronously as an argument. An alternative is hpx::post, which is a fire-and-forget method.

Tip

If you think you will want to synchronize your tasks later on, we suggest you use hpx::async, which provides synchronization options, while hpx::post explicitly states that there is no return value or way to synchronize with the function execution. Synchronization options are listed below.

Task wait#

OpenMP code:

#pragma omp task
{
    // task code
}

#pragma omp taskwait
// code after completion of task

HPX equivalent:

#include <hpx/future.hpp>

hpx::async([](){
    // task code
}).get(); // wait for the task to complete

// code after completion of task

The get() function can be used to ensure that the task created with hpx::async is completed before the code continues executing beyond that point.

Multiple tasks synchronization#

OpenMP code:

#pragma omp task
{
    // task 1 code
}

#pragma omp task
{
    // task 2 code
}

#pragma omp taskwait
// code after completion of both tasks 1 and 2

HPX equivalent:

#include <hpx/future.hpp>

auto future1 = hpx::async([](){
    // task 1 code
});

auto future2 = hpx::async([](){
    // task 2 code
});

auto future = hpx::when_all(future1, future2).then([](auto&&){
    // code after completion of both tasks 1 and 2
});

If you would like to synchronize multiple tasks, you can use the hpx::when_all function to define which futures have to be ready and the then() function to declare what should be executed once these futures are ready.
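For comparison, here is a minimal standard-library sketch of the same idea. std::future offers neither when_all nor then(), so this version simply blocks on both futures before running the follow-up code; sum_after_both and the arithmetic inside the tasks are assumptions made for illustration.

```cpp
#include <cassert>
#include <future>

// Standard-library analogue: launch two tasks, then run follow-up code that
// needs both results. The get() calls block until each task has finished.
int sum_after_both(int x, int y)
{
    std::future<int> future1 =
        std::async(std::launch::async, [x] { return x * 2; });    // task 1
    std::future<int> future2 =
        std::async(std::launch::async, [y] { return y * 3; });    // task 2

    // code after completion of both tasks 1 and 2
    return future1.get() + future2.get();
}
```

The HPX version differs in that the continuation is attached without blocking the calling thread.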

Dependencies#

OpenMP code:

int a = 10;
int b = 20;
int c = 0;

#pragma omp task depend(in: a, b) depend(out: c)
{
    // task code
    c = 100;
}

HPX equivalent:

#include <hpx/future.hpp>

int a = 10;
int b = 20;
int c = 0;

// Create a future representing 'a'
auto future_a = hpx::make_ready_future(a);

// Create a future representing 'b'
auto future_b = hpx::make_ready_future(b);

// Create a task that depends on 'a' and 'b'; it runs once both futures are ready
auto future_c = hpx::dataflow(
    [](auto&& fa, auto&& fb) {
        // task code (the ready futures are passed in as arguments)
        return 100;
    },
    future_a, future_b);

c = future_c.get();

If any argument of hpx::dataflow is a future, hpx::dataflow waits for that future to become ready before launching the task, and the futures are then passed to the task function as arguments. Hence, to define the dependencies of tasks, you create futures representing the variables that introduce dependencies and pass them as arguments to hpx::dataflow. get() is used to store the result of the future in the desired variable.
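The dependency pattern can be approximated with the standard library alone. In the sketch below, run_dependent_task is a hypothetical function, and the dependent task blocks on its inputs inside its own thread, whereas hpx::dataflow delays launching the task until the inputs are ready.

```cpp
#include <cassert>
#include <future>
#include <utility>

// Standard-library analogue of a task that depends on two inputs.
int run_dependent_task(int a, int b)
{
    std::future<int> future_a =
        std::async(std::launch::async, [a] { return a; });
    std::future<int> future_b =
        std::async(std::launch::async, [b] { return b; });

    // The dependent task owns both input futures and waits for them.
    std::future<int> future_c = std::async(std::launch::async,
        [fa = std::move(future_a), fb = std::move(future_b)]() mutable {
            return fa.get() + fb.get();    // proceeds once both are ready
        });

    return future_c.get();
}
```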

Nested tasks#

OpenMP code:

#pragma omp task
{
    // Outer task code
    #pragma omp task
    {
        // Inner task code
    }
}

HPX equivalent:

#include <hpx/future.hpp>

auto future_outer = hpx::async([](){
    // Outer task code

    hpx::async([](){
        // Inner task code
    });
});

or

#include <hpx/future.hpp>

hpx::post([](){ // fire and forget
    // Outer task code

    hpx::post([](){ // fire and forget
        // Inner task code
    });
});

If you have nested tasks, you can simply use nested hpx::async or hpx::post calls. The implementation is similar if you want to take care of synchronization:

OpenMP code:

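#pragma omp task
{
    // Outer task code
    #pragma omp task
    {
        // Inner task code
    }
    #pragma omp taskwait    // wait for the inner task to complete
}
#pragma omp taskwait    // wait for the outer task to complete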

HPX equivalent:

#include <hpx/future.hpp>

auto future_outer = hpx::async([]() {
    // Outer task code

    hpx::async([]() {
        // Inner task code
    }).get();    // Wait for the inner task to complete
});

future_outer.get();    // Wait for the outer task to complete

Task yield#

OpenMP code:

#pragma omp task
{
    // code before yielding
    #pragma omp taskyield
    // code after yielding
}

HPX equivalent:

#include <hpx/future.hpp>
#include <hpx/thread.hpp>

auto future = hpx::async([](){
    // code before yielding
});

// yield execution to potentially allow other tasks to run
hpx::this_thread::yield();

// code after yielding

After creating a task using hpx::async, hpx::this_thread::yield can be used to reschedule the execution of threads, allowing other threads to run.

Task group#

OpenMP code:

#pragma omp taskgroup
{
    #pragma omp task
    {
        // task 1 code
    }

    #pragma omp task
    {
        // task 2 code
    }
}

HPX equivalent:

#include <hpx/task_group.hpp>

// Declare a task group
hpx::experimental::task_group tg;

// Run the tasks
tg.run([](){
    // task 1 code
});
tg.run([](){
    // task 2 code
});

// Wait for the task group
tg.wait();

To create task groups, you can use hpx::experimental::task_group. The function run() can be used to run each task within the task group, while wait() can be used to achieve synchronization. If you do not care about waiting for the task group to complete its execution, you can simply remove the wait() function.

OpenMP sections#

OpenMP code:

#pragma omp sections
{
    #pragma omp section
    // section 1 code
    #pragma omp section
    // section 2 code
} // implicit synchronization

HPX equivalent:

#include <hpx/future.hpp>

auto future_section1 = hpx::async([](){
    // section 1 code
});
auto future_section2 = hpx::async([](){
    // section 2 code
});

// synchronization: wait for both sections to complete
hpx::wait_all(future_section1, future_section2);

Unlike tasks, there is an implicit synchronization barrier at the end of each sections directive in OpenMP. In HPX, this synchronization is achieved using the hpx::wait_all function.

Note

If the nowait clause is used in the sections directive, then you can just remove the hpx::wait_all function while keeping the rest of the code as it is.

Intel Threading Building Blocks (TBB)#

Intel Threading Building Blocks (TBB) provides a high-level interface for parallelism and concurrent programming using standard ISO C++ code. Below are some examples of how to convert TBB code to HPX code:

parallel_for#

Intel Threading Building Blocks (TBB) code:

auto values = std::vector<double>(10000);

tbb::parallel_for( tbb::blocked_range<int>(0,values.size()),
                    [&](tbb::blocked_range<int> r)
{
    for (int i=r.begin(); i<r.end(); ++i)
    {
        // loop body
    }
});

HPX equivalent:

#include <hpx/algorithm.hpp>

auto values = std::vector<double>(10000);

hpx::experimental::for_loop(hpx::execution::par, 0, values.size(), [&](int i) {
    // loop body
});

In the above code, tbb::parallel_for is replaced with hpx::experimental::for_loop from the HPX library. The loop body within the lambda function will be executed in parallel for each iteration.

parallel_for_each#

Intel Threading Building Blocks (TBB) code:

auto values = std::vector<double>(10000);

tbb::parallel_for_each(values.begin(), values.end(), [&](double& value){
    // loop body
});

HPX equivalent:

#include <hpx/algorithm.hpp>

auto values = std::vector<double>(10000);

hpx::for_each(hpx::execution::par, values.begin(), values.end(), [&](double& value){
    // loop body
});

By utilizing hpx::for_each and specifying a parallel execution policy with hpx::execution::par, it is possible to transform tbb::parallel_for_each into its equivalent counterpart in HPX.
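A for_each-style algorithm invokes its callable once per element and passes that element as the argument. The sketch below shows the shape of such a callable using the sequential std::for_each; the function name doubled and the doubling operation are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sequential std::for_each, shown for the shape of the callable only: it
// receives each element by reference, so it can modify the element in place.
std::vector<double> doubled(std::vector<double> values)
{
    std::for_each(values.begin(), values.end(), [](double& value) {
        value *= 2.0;    // loop body
    });
    return values;
}
```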

parallel_invoke#

Intel Threading Building Blocks (TBB) code:

tbb::parallel_invoke(task1, task2, task3);

HPX equivalent:

#include <hpx/future.hpp>

hpx::wait_all(hpx::async(task1), hpx::async(task2), hpx::async(task3));

To convert tbb::parallel_invoke to HPX, we use hpx::async to execute each task asynchronously; each call returns a future representing the result of its task. We then pass these futures to hpx::wait_all, which blocks until all of them have completed.

parallel_pipeline#

Intel Threading Building Blocks (TBB) code:

tbb::parallel_pipeline(4,
    tbb::make_filter<void, int>(tbb::filter::serial_in_order,
        [](tbb::flow_control& fc) -> int {
            // Generate numbers from 1 to 10
            static int i = 1;
            if (i <= 10) {
                return i++;
            }
            else {
                fc.stop();
                return 0;
            }
        }) &
    tbb::make_filter<int, int>(tbb::filter::parallel,
        [](int num) -> int {
            // Multiply each number by 2
            return num * 2;
        }) &
    tbb::make_filter<int, void>(tbb::filter::serial_in_order,
        [](int num) {
            // Print the results
            std::cout << num << " ";
        })
);

HPX equivalent:

#include <iostream>
#include <vector>
#include <ranges>
#include <hpx/algorithm.hpp>

// generate the values
auto range = std::views::iota(1) | std::views::take(10);

// materialize the output vector
std::vector<int> results(10);

// execute the transformation in parallel
hpx::ranges::transform(
    hpx::execution::par, range, results.begin(), [](int i) { return 2 * i; });

// print the modified vector
for (int i : results)
{
    std::cout << i << " ";
}
std::cout << std::endl;

The line auto range = std::views::iota(1) | std::views::take(10); generates a range of values using the std::views::iota function. It starts from the value 1 and generates an infinite sequence of incrementing values. The std::views::take(10) function is then applied to limit the sequence to the first 10 values. The result is stored in the range variable.

Hint

A view is a lightweight object that represents a particular view of a sequence or range. It acts as a read-only interface to the original data, providing a way to query and traverse the elements without making any copies or modifications.

Views can be composed and chained together to form complex pipelines of operations. These operations are evaluated lazily, meaning that the actual computation is performed only when the result is needed or consumed.

Since views are evaluated lazily, we use std::vector<int> results(10); to materialize the vector that will store the transformed values. The hpx::ranges::transform function is then used to perform a parallel transformation on the range, writing the transformed values to the results vector.
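The same generate, materialize, and transform steps can be written without ranges or parallelism, which may help when checking the expected output. This is a sequential sketch using std::iota and std::transform; doubled_sequence is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Sequential sketch: generate 1..count, then write the doubled values into
// an output vector that has been sized up front.
std::vector<int> doubled_sequence(int count)
{
    assert(count >= 0);
    std::vector<int> input(count);
    std::iota(input.begin(), input.end(), 1);    // 1, 2, ..., count

    std::vector<int> results(count);    // destination must already exist
    std::transform(input.begin(), input.end(), results.begin(),
        [](int i) { return 2 * i; });
    return results;
}
```

Calling doubled_sequence(10) returns the values 2, 4, ..., 20.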

Hint

Ranges enable loop fusion by combining multiple operations into a single parallel loop, eliminating waiting time and reducing overhead. Using ranges, you can express these operations as a pipeline of transformations on a sequence of elements. This pipeline is evaluated in a single pass, performing all the desired operations in parallel without the need to wait between them.

In addition, HPX enhances the benefits of range fusion by offering parallel execution policies, which can be used to optimize the execution of the fused loop across multiple threads.

parallel_reduce#
Reduction#

Intel Threading Building Blocks (TBB) code:

auto values = std::vector<double>{1,2,3,4,5,6,7,8,9};

auto total = tbb::parallel_reduce(
                tbb::blocked_range<int>(0,values.size()),
                0.0,
                [&](tbb::blocked_range<int> r, double running_total)
                {
                    for (int i=r.begin(); i<r.end(); ++i)
                    {
                        running_total += values[i];
                    }


                    return running_total;
                },
                std::plus<double>());

HPX equivalent:

#include <hpx/numeric.hpp>

auto values = std::vector<double>{1,2,3,4,5,6,7,8,9};

// the initial value 0.0 is a double, matching the element type
auto total = hpx::reduce(
    hpx::execution::par, values.begin(), values.end(), 0.0, std::plus{});

By using hpx::reduce and specifying a parallel execution policy with hpx::execution::par, tbb::parallel_reduce can be replaced by its HPX counterpart. As the example shows, HPX manages the intermediate results internally, so no explicit running total is needed.
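One detail worth checking when porting a reduction is the type of the initial value. In the standard std::reduce, the result has the type of the initial value, so an integer literal produces an int result even for double elements. The sketch below demonstrates this with the sequential std::reduce; it assumes, without the source confirming it, that hpx::reduce follows the same rule.

```cpp
#include <cassert>
#include <numeric>
#include <type_traits>
#include <utility>
#include <vector>

using Iter = std::vector<double>::const_iterator;

// The result type of std::reduce is the type of the initial value.
static_assert(std::is_same_v<decltype(std::reduce(std::declval<Iter>(),
                                 std::declval<Iter>(), 0)),
                  int>,
    "an int initial value gives an int result");
static_assert(std::is_same_v<decltype(std::reduce(std::declval<Iter>(),
                                 std::declval<Iter>(), 0.0)),
                  double>,
    "a double initial value gives a double result");

// Summing doubles with a double initial value keeps the full precision.
double sum_values(std::vector<double> const& values)
{
    return std::reduce(values.begin(), values.end(), 0.0);
}
```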

Transformation & Reduction#

Intel Threading Building Blocks (TBB) code:

auto values = std::vector<double>{1,2,3,4,5,6,7,8,9};

double transform_function(double current_value){
    // transformation code (placeholder: returns the value unchanged)
    return current_value;
}

auto total = tbb::parallel_reduce(
                tbb::blocked_range<int>(0,values.size()),
                0.0,
                [&](tbb::blocked_range<int> r, double transformed_val)
                {
                    for (int i=r.begin(); i<r.end(); ++i)
                    {
                        transformed_val += transform_function(values[i]);
                    }
                    return transformed_val;
                },
                std::plus<double>());

HPX equivalent:

#include <hpx/numeric.hpp>

auto values = std::vector<double>{1,2,3,4,5,6,7,8,9};

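double transform_function(double current_value)
{
    // transformation code (placeholder: returns the value unchanged)
    return current_value;
}

// the initial value 0.0 is a double, matching the element type
auto total = hpx::transform_reduce(hpx::execution::par, values.begin(),
    values.end(), 0.0, std::plus{},
    [&](double current_value) { return transform_function(current_value); });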

In situations where certain values require transformation before the reduction process, HPX provides a straightforward solution through hpx::transform_reduce. The transform_function() allows for the application of the desired transformation to each value.
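As a concrete instance, the sketch below computes a sum of squares with the sequential std::transform_reduce from the C++17 standard library. The transformation square and the function sum_of_squares are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical transformation used for illustration: square the value.
double square(double x)
{
    return x * x;
}

// Apply square to each element, then combine the transformed values with
// std::plus, starting from the initial value 0.0.
double sum_of_squares(std::vector<double> const& values)
{
    return std::transform_reduce(
        values.begin(), values.end(), 0.0, std::plus<>{}, square);
}
```

For the values 1 through 9 used above, sum_of_squares returns 285.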

parallel_scan#

Intel Threading Building Blocks (TBB) code:

tbb::parallel_scan(tbb::blocked_range<size_t>(0, input.size()),
    0,
    [&input, &output](const tbb::blocked_range<size_t>& range, int partial_sum, bool is_final_scan) {
        for (size_t i = range.begin(); i != range.end(); ++i) {
            partial_sum += input[i];
            if (is_final_scan) {
                output[i] = partial_sum;
            }
        }
        return partial_sum;
    },
    [](int left_sum, int right_sum) {
        return left_sum + right_sum;
    }
);

HPX equivalent:

#include <hpx/numeric.hpp>

hpx::inclusive_scan(hpx::execution::par, input.begin(), input.end(),
    output.begin(),
    [](const int& left, const int& right) { return left + right; });

hpx::inclusive_scan with hpx::execution::par as the execution policy performs a prefix scan in parallel. HPX manages the intermediate results internally, so you do not have to handle partial sums yourself. input.begin() and input.end() mark the beginning and end of the sequence the algorithm is applied to, and output.begin() marks the beginning of the destination. The last argument is the binary operation used to combine values (here, addition).

Apart from hpx::inclusive_scan, HPX provides its users with hpx::exclusive_scan. The key difference between inclusive scan and exclusive scan lies in the treatment of the current element during the scan operation. In an inclusive scan, each element in the output sequence includes the contribution of the corresponding element in the input sequence, while in an exclusive scan, the current element in the input sequence does not contribute to the corresponding element in the output sequence.
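The difference is easiest to see on a small input. The sketch below uses the sequential std::inclusive_scan and std::exclusive_scan from the C++17 standard library; the wrapper names inclusive_sums and exclusive_sums are assumptions made for this example.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Running totals that include the current element.
std::vector<int> inclusive_sums(std::vector<int> const& input)
{
    std::vector<int> output(input.size());
    std::inclusive_scan(input.begin(), input.end(), output.begin());
    return output;
}

// Running totals that exclude the current element; the first output is init.
std::vector<int> exclusive_sums(std::vector<int> const& input, int init)
{
    std::vector<int> output(input.size());
    std::exclusive_scan(input.begin(), input.end(), output.begin(), init);
    return output;
}
```

For the input {1, 2, 3, 4}, the inclusive scan produces {1, 3, 6, 10}, while the exclusive scan with an initial value of 0 produces {0, 1, 3, 6}.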

parallel_sort#

Intel Threading Building Blocks (TBB) code:

std::vector<int> numbers = {9, 2, 7, 1, 5, 3};

tbb::parallel_sort(numbers.begin(), numbers.end());

HPX equivalent:

#include <hpx/algorithm.hpp>

std::vector<int> numbers = {9, 2, 7, 1, 5, 3};

hpx::sort(hpx::execution::par, numbers.begin(), numbers.end());

hpx::sort provides an equivalent functionality to tbb::parallel_sort. When given a parallel execution policy with hpx::execution::par, the algorithm employs parallel execution, allowing for efficient sorting across available threads.

task_group#

Intel Threading Building Blocks (TBB) code:

// Declare a task group
tbb::task_group tg;

// Run the tasks
tg.run(task1);
tg.run(task2);

// Wait for the task group
tg.wait();

HPX equivalent:

#include <hpx/task_group.hpp>

// Declare a task group
hpx::experimental::task_group tg;

// Run the tasks
tg.run(task1);
tg.run(task2);

// Wait for the task group
tg.wait();

HPX drew inspiration from Intel Threading Building Blocks (TBB) to introduce the hpx::experimental::task_group feature. Therefore, utilizing hpx::experimental::task_group provides an equivalent functionality to tbb::task_group.

MPI#

MPI is a standardized communication protocol and library that allows multiple processes or nodes in a parallel computing system to exchange data and coordinate their execution.

List of MPI-HPX functions#

MPI function     HPX equivalent
-------------    --------------------------------------------------------------
MPI_Allgather    hpx::collectives::all_gather
MPI_Allreduce    hpx::collectives::all_reduce
MPI_Alltoall     hpx::collectives::all_to_all
MPI_Barrier      hpx::distributed::barrier
MPI_Bcast        hpx::collectives::broadcast_to() and hpx::collectives::broadcast_from() used with get()
MPI_Comm_size    hpx::get_num_localities
MPI_Comm_rank    hpx::get_locality_id()
MPI_Exscan       hpx::collectives::exclusive_scan() used with get()
MPI_Gather       hpx::collectives::gather_here() and hpx::collectives::gather_there() used with get()
MPI_Irecv        hpx::collectives::get()
MPI_Isend        hpx::collectives::set()
MPI_Reduce       hpx::collectives::reduce_here and hpx::collectives::reduce_there used with get()
MPI_Scan         hpx::collectives::inclusive_scan() used with get()
MPI_Scatter      hpx::collectives::scatter_to() and hpx::collectives::scatter_from()
MPI_Wait         hpx::collectives::get() used with a future i.e. setf.get()

MPI_Send & MPI_Recv#

Let’s assume we have the following simple message passing code where each process sends a message to the next process in a circular manner. The exchanged message is modified and printed to the console.

MPI code:

#include <cstddef>
#include <cstdint>
#include <iostream>
#include <mpi.h>
#include <vector>

constexpr int times = 2;

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int num_localities;
    MPI_Comm_size(MPI_COMM_WORLD, &num_localities);

    int this_locality;
    MPI_Comm_rank(MPI_COMM_WORLD, &this_locality);

    int next_locality = (this_locality + 1) % num_localities;
    std::vector<int> msg_vec = {0, 1};

    int cnt = 0;
    int msg = msg_vec[this_locality];

    int recv_msg;
    MPI_Request request_send, request_recv;
    MPI_Status status;

    while (cnt < times) {
        cnt += 1;

        MPI_Isend(&msg, 1, MPI_INT, next_locality, cnt, MPI_COMM_WORLD,
                &request_send);
        MPI_Irecv(&recv_msg, 1, MPI_INT, next_locality, cnt, MPI_COMM_WORLD,
                &request_recv);

        MPI_Wait(&request_send, &status);
        MPI_Wait(&request_recv, &status);

        std::cout << "Time: " << cnt << ", Locality " << this_locality
                << " received msg: " << recv_msg << "\n";

        recv_msg += 10;
        msg = recv_msg;
    }

    MPI_Finalize();
    return 0;
}

HPX equivalent:

#include <hpx/config.hpp>

#if !defined(HPX_COMPUTE_DEVICE_CODE)
#include <hpx/algorithm.hpp>
#include <hpx/hpx_init.hpp>
#include <hpx/modules/collectives.hpp>

#include <cstddef>
#include <cstdint>
#include <iostream>
#include <utility>
#include <vector>

using namespace hpx::collectives;

constexpr char const* channel_communicator_name =
    "/example/channel_communicator/";

// the number of times
constexpr int times = 2;

int hpx_main()
{
    std::uint32_t num_localities = hpx::get_num_localities(hpx::launch::sync);
    std::uint32_t this_locality = hpx::get_locality_id();

    // allocate channel communicator
    auto comm = create_channel_communicator(hpx::launch::sync,
        channel_communicator_name, num_sites_arg(num_localities),
        this_site_arg(this_locality));

    std::uint32_t next_locality = (this_locality + 1) % num_localities;
    std::vector<int> msg_vec = {0, 1};

    int cnt = 0;
    int msg = msg_vec[this_locality];

    // send values to another locality
    auto setf = set(comm, that_site_arg(next_locality), msg, tag_arg(cnt));
    auto got_msg = get<int>(comm, that_site_arg(next_locality), tag_arg(cnt));

    setf.get();

    while (cnt < times)
    {
        cnt += 1;

        auto done_msg = got_msg.then([&](auto&& f) {
            int rec_msg = f.get();
            std::cout << "Time: " << cnt << ", Locality " << this_locality
                      << " received msg: " << rec_msg << "\n";

            // change msg by adding 10
            rec_msg += 10;

            // start next round
            setf =
                set(comm, that_site_arg(next_locality), rec_msg, tag_arg(cnt));
            got_msg =
                get<int>(comm, that_site_arg(next_locality), tag_arg(cnt));
            setf.get();
        });

        done_msg.get();
    }

    return hpx::finalize();
}
#endif

int main(int argc, char* argv[])
{
#if !defined(HPX_COMPUTE_DEVICE_CODE)
    hpx::init_params params;
    params.cfg = {"--hpx:run-hpx-main"};
    return hpx::init(argc, argv, params);
#else
    (void) argc;
    (void) argv;
    return 0;
#endif
}

To perform message passing between different processes in HPX we can use a channel communicator. To understand this example, let’s focus on the hpx_main() function:

  • hpx::get_num_localities(hpx::launch::sync) retrieves the number of localities, while hpx::get_locality_id() returns the ID of the current locality.

  • The create_channel_communicator function is used to create a channel that serves the communication. This function takes several arguments, including the launch policy (hpx::launch::sync), the name of the communicator (channel_communicator_name), the number of localities, and the ID of the current locality.

  • The communication follows a ring pattern, where each process (or locality) sends a message to its neighbor in a circular manner. This means that the messages circulate around the localities, ensuring that the communication wraps around when reaching the end of the locality sequence. To achieve this, the next_locality variable is calculated as the ID of the next locality in the ring.

  • The initial values for the communication are set (msg_vec, cnt, msg).

  • The set() function is called to send the message to the next locality in the ring. The message is sent asynchronously and is associated with a tag (cnt).

  • The get() function is called to receive a message from the next locality. It is also associated with the same tag as the set() operation.

  • The setf.get() call blocks until the message sending operation is complete.

  • A continuation is set up using the function then() to handle the received message. Inside the continuation:

    • The received message value (rec_msg) is retrieved using f.get().

    • The received message is printed to the console and then modified by adding 10.

    • The set() and get() operations are repeated to send the modified message to the next locality and to receive the next message from it.

    • The setf.get() call blocks until the new message sending operation is complete.

  • The done_msg.get() call blocks until the continuation is complete for the current loop iteration.
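The message flow described in the list above can be traced with a small single-process model that involves neither HPX nor MPI. The function simulate_ring is hypothetical, and it assumes that each locality receives from its predecessor in the ring; with two localities, as in the example, the predecessor and the next locality are the same.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Single-process model of the ring exchange. msgs[i] is the message that
// locality i is about to send. Returns the messages received by each
// locality in the last round.
std::vector<int> simulate_ring(std::vector<int> msgs, int times)
{
    assert(!msgs.empty() && times > 0);
    std::size_t const n = msgs.size();
    std::vector<int> received(n);

    for (int round = 0; round < times; ++round)
    {
        for (std::size_t i = 0; i != n; ++i)
        {
            // Locality i receives what its predecessor sent to its own
            // next locality, since (prev + 1) % n == i.
            std::size_t const prev = (i + n - 1) % n;
            received[i] = msgs[prev];
        }
        for (std::size_t i = 0; i != n; ++i)
        {
            msgs[i] = received[i] + 10;    // modify before the next round
        }
    }
    return received;
}
```

With the initial messages {0, 1} and two rounds, locality 0 receives 1 and then 10, while locality 1 receives 0 and then 11.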

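In summary, this example uses the following entries from the table at the beginning of this section: MPI_Comm_size, MPI_Comm_rank, MPI_Isend, MPI_Irecv, and MPI_Wait, together with their HPX equivalents.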

MPI_Gather#

The following code gathers data from all processes to the root process and verifies the gathered data in the root process.

MPI code:

#include <cstdint>
#include <iostream>
#include <mpi.h>
#include <numeric>
#include <vector>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int num_localities, this_locality;
    MPI_Comm_size(MPI_COMM_WORLD, &num_localities);
    MPI_Comm_rank(MPI_COMM_WORLD, &this_locality);

    std::vector<int> local_data; // Data to be gathered

    if (this_locality == 0) {
        local_data.resize(num_localities); // Resize the vector on the root process
    }

    // Each process calculates its local data value
    int my_data = 42 + this_locality;

    for (std::uint32_t i = 0; i != 10; ++i) {

        // Gather data from all processes to the root process (process 0)
        MPI_Gather(&my_data, 1, MPI_INT, local_data.data(), 1, MPI_INT, 0,
                MPI_COMM_WORLD);

        // Only the root process (process 0) will print the gathered data
        if (this_locality == 0) {
            std::cout << "Gathered data on the root: ";
            for (int j = 0; j < num_localities; ++j) {
                std::cout << local_data[j] << " ";
            }
            std::cout << std::endl;
        }
    }
    std::cout << std::endl;

    MPI_Finalize();
    return 0;
}

HPX equivalent:

std::uint32_t num_localities = hpx::get_num_localities(hpx::launch::sync);
std::uint32_t this_locality = hpx::get_locality_id();

// test functionality based on immediate local result value
auto gather_direct_client = create_communicator(gather_direct_basename,
    num_sites_arg(num_localities), this_site_arg(this_locality));

for (std::uint32_t i = 0; i != 10; ++i)
{
    if (this_locality == 0)
    {
        hpx::future<std::vector<std::uint32_t>> overall_result =
            gather_here(gather_direct_client, std::uint32_t(42));

        std::vector<std::uint32_t> sol = overall_result.get();
        std::cout << "Gathered data on the root:";

        for (std::size_t j = 0; j != sol.size(); ++j)
        {
            HPX_TEST(j + 42 == sol[j]);
            std::cout << " " << sol[j];
        }
        std::cout << std::endl;
    }
    else
    {
        hpx::future<void> overall_result =
            gather_there(gather_direct_client, this_locality + 42);
        overall_result.get();
    }

}

For num_localities = 2, this code prints the following message 10 times:

Gathered data on the root: 42 43

HPX uses two functions to implement the functionality of MPI_Gather: gather_here and gather_there. gather_here gathers data from all localities on the locality with ID 0 (the root locality). gather_there allows non-root localities to participate in the gather operation by sending data to the root locality. In more detail:

  • hpx::get_num_localities(hpx::launch::sync) retrieves the number of localities, while hpx::get_locality_id() returns the ID of the current locality.

  • The function create_communicator() is used to create a communicator called gather_direct_client.

  • If the current locality is the root (its ID is equal to 0):

    • The gather_here function is used to perform the gather operation. It collects data from all other localities into the overall_result future object. Its arguments are the communicator (gather_direct_client) and the value contributed by the root locality (here 42). The base name (gather_direct_basename), the number of localities (num_localities), and the current locality ID (this_locality) were already supplied when the communicator was created.

    • The get() member function of the overall_result future is used to retrieve the gathered data.

    • The next for loop verifies the correctness of the gathered data (sol). HPX_TEST is a macro provided by the HPX testing utilities that performs a check similar to the standard C++ assert macro.

  • If the current locality is not the root:

    • The gather_there function is used to participate in the gather operation initiated by the root locality. It sends the data (in this case, the value this_locality + 42) to the root locality, indicating that it should be included in the gathering.

    • The get() member function of the overall_result future is used to wait for the gather operation to complete for this locality.
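The layout of the gathered result can be pictured with a single-process model that does not involve HPX. The function model_gather is hypothetical; it reproduces the values checked by HPX_TEST in the example, where the entry for locality j equals 42 + j.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Single-process model of a gather: locality i contributes 42 + i, and the
// root receives one slot per locality, ordered by locality ID.
std::vector<std::uint32_t> model_gather(std::uint32_t num_localities)
{
    assert(num_localities > 0);
    std::vector<std::uint32_t> gathered(num_localities);
    for (std::uint32_t site = 0; site != num_localities; ++site)
    {
        gathered[site] = 42 + site;    // value contributed by this site
    }
    return gathered;
}
```

For two localities the model returns {42, 43}, matching the output shown above.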

MPI_Scatter#

The following code scatters data from the root process to all processes and verifies the received data on each process.

MPI code:

#include <iostream>
#include <mpi.h>
#include <vector>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int num_localities, this_locality;
    MPI_Comm_size(MPI_COMM_WORLD, &num_localities);
    MPI_Comm_rank(MPI_COMM_WORLD, &this_locality);

    std::vector<int> data(num_localities);

    if (this_locality == 0) {
        // Fill the data vector on the root locality (locality 0)
        for (int i = 0; i < num_localities; ++i) {
            data[i] = 42 + i;
        }
    }

    int local_data; // Variable to store the received data

    // Scatter data from the root locality to all localities
    MPI_Scatter(&data[0], 1, MPI_INT, &local_data, 1, MPI_INT, 0, MPI_COMM_WORLD);

    // Now, each locality has its own local_data

    // Print the local_data on each locality
    std::cout << "Locality " << this_locality << " received " << local_data
                << std::endl;

    MPI_Finalize();
    return 0;
}

HPX equivalent:

std::uint32_t num_localities = hpx::get_num_localities(hpx::launch::sync);
HPX_TEST_LTE(std::uint32_t(2), num_localities);

std::uint32_t this_locality = hpx::get_locality_id();

auto scatter_direct_client =
    hpx::collectives::create_communicator(scatter_direct_basename,
        num_sites_arg(num_localities), this_site_arg(this_locality));

// test functionality based on immediate local result value
for (std::uint32_t i = 0; i != 10; ++i)
{
    if (this_locality == 0)
    {
        std::vector<std::uint32_t> data(num_localities);
        std::iota(data.begin(), data.end(), 42 + i);

        hpx::future<std::uint32_t> result =
            scatter_to(scatter_direct_client, std::move(data));

        HPX_TEST_EQ(i + 42 + this_locality, result.get());
    }
    else
    {
        hpx::future<std::uint32_t> result =
            scatter_from<std::uint32_t>(scatter_direct_client);

        HPX_TEST_EQ(i + 42 + this_locality, result.get());

        std::cout << "Locality " << this_locality << " received "
                  << i + 42 + this_locality << std::endl;
    }
}

For num_localities = 2, this code runs 10 iterations and prints the following messages:

Locality 1 received 43
Locality 1 received 44
Locality 1 received 45
Locality 1 received 46
Locality 1 received 47
Locality 1 received 48
Locality 1 received 49
Locality 1 received 50
Locality 1 received 51
Locality 1 received 52

HPX uses two functions to implement the functionality of MPI_Scatter: hpx::collectives::scatter_to and hpx::collectives::scatter_from. scatter_to distributes the data from the locality with ID 0 (the root locality) to all localities, while scatter_from allows non-root localities to receive their data from the root locality. In more detail:

  • hpx::get_num_localities(hpx::launch::sync) retrieves the number of localities, while hpx::get_locality_id() returns the ID of the current locality.

  • The function hpx::collectives::create_communicator() is used to create a communicator called scatter_direct_client.

  • If the current locality is the root (its ID is equal to 0):

    • The data vector is filled with values ranging from 42 + i to 42 + i + num_localities - 1.

    • The hpx::collectives::scatter_to function performs the scatter operation using the communicator scatter_direct_client. It scatters the data vector across the localities and returns a future holding the element assigned to the root locality.

    • HPX_TEST_EQ is a macro provided by the HPX testing utilities to test the distributed values.

  • If the current locality is not the root:

    • The hpx::collectives::scatter_from function is used to receive the data sent by the root locality.

    • HPX_TEST_EQ is a macro provided by the HPX testing utilities to test the collected values.
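The values each locality receives can be checked with a single-process model that does not involve HPX. The function model_scatter is hypothetical; it mirrors the loop in the example, where the root fills the data vector starting at 42 + i in iteration i and locality k receives element k.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Single-process model of one scatter iteration: the root fills data with
// 42 + iteration, 43 + iteration, ... and the given locality receives the
// element at its own index.
std::uint32_t model_scatter(std::uint32_t num_localities,
    std::uint32_t iteration, std::uint32_t locality)
{
    assert(locality < num_localities);
    std::vector<std::uint32_t> data(num_localities);
    std::iota(data.begin(), data.end(), 42 + iteration);
    return data[locality];
}
```

For two localities, locality 1 receives 43 in iteration 0 and 52 in iteration 9, matching the printed output.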

MPI_Allgather#

The following code gathers data from all processes and sends the data to all processes.

MPI code:

#include <cstdint>
#include <iostream>
#include <mpi.h>
#include <vector>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Get the number of MPI processes
    int num_localities = size;

    // Get the MPI process rank
    int here = rank;

    std::uint32_t value = here;

    std::vector<std::uint32_t> r(num_localities);

    // Perform an all-gather operation to gather values from all processes.
    MPI_Allgather(&value, 1, MPI_UINT32_T, r.data(), 1, MPI_UINT32_T,
                    MPI_COMM_WORLD);

    // Print the result.
    std::cout << "Locality " << here << " has values:";
    for (size_t j = 0; j < r.size(); ++j) {
        std::cout << " " << r[j];
    }
    std::cout << std::endl;

    MPI_Finalize();
    return 0;
}

HPX equivalent:

std::uint32_t num_localities = hpx::get_num_localities(hpx::launch::sync);
std::uint32_t here = hpx::get_locality_id();

// test functionality based on immediate local result value
auto all_gather_direct_client =
    create_communicator(all_gather_direct_basename,
        num_sites_arg(num_localities), this_site_arg(here));

std::uint32_t value = here;

hpx::future<std::vector<std::uint32_t>> overall_result =
    all_gather(all_gather_direct_client, value);

std::vector<std::uint32_t> r = overall_result.get();

std::cout << "Locality " << here << " has values:";
for (std::size_t j = 0; j != r.size(); ++j)
{
    std::cout << " " << r[j];
}
std::cout << std::endl;

For num_localities = 2 this code will print the following message:

Locality 0 has values: 0 1
Locality 1 has values: 0 1

HPX uses the function all_gather to implement the functionality of MPI_Allgather. In more detail:

  • hpx::get_num_localities(hpx::launch::sync) retrieves the number of localities, while hpx::get_locality_id() returns the ID of the current locality.

  • The function hpx::collectives::create_communicator() is used to create a communicator called all_gather_direct_client.

  • The values that the localities exchange with each other are equal to each locality’s ID.

  • The gather operation is performed using all_gather. The result is stored in an hpx::future object called overall_result, which represents a future result that can be retrieved later when needed.

  • The get() function waits until the result is available and then stores it in the vector called r.
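To make the data movement concrete, here is a minimal sequential sketch of what an all-gather produces, written with the C++ standard library only. It does not call HPX or MPI. The name model_all_gather is hypothetical and used only for illustration, and element i of the input stands for the value contributed by locality i.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sequential model of an all-gather (hypothetical helper, not part of HPX).
// contributions[i] is the value supplied by locality i.
// The returned vector holds, for every locality, the data it ends up with:
// a copy of all contributions, ordered by locality ID.
std::vector<std::vector<std::uint32_t>> model_all_gather(
    std::vector<std::uint32_t> const& contributions)
{
    std::size_t const num_localities = contributions.size();
    return std::vector<std::vector<std::uint32_t>>(
        num_localities, contributions);
}
```

For two localities contributing 0 and 1, every entry of the returned vector is {0, 1}, which matches the output listed above.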

MPI_Allreduce#

The following code combines values from all processes and distributes the result back to all processes.

MPI code:

#include <cstdint>
#include <iostream>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Get the number of MPI processes
    int num_localities = size;

    // Get the MPI process rank
    int here = rank;

    // Create a communicator for the all reduce operation.
    MPI_Comm all_reduce_direct_client;
    MPI_Comm_split(MPI_COMM_WORLD, 0, rank, &all_reduce_direct_client);

    // Perform the all reduce operation to calculate the sum of 'here' values.
    std::uint32_t value = here;
    std::uint32_t res = 0;
    MPI_Allreduce(&value, &res, 1, MPI_UINT32_T, MPI_SUM,
                    all_reduce_direct_client);

    std::cout << "Locality " << rank << " has value: " << res << std::endl;

    MPI_Finalize();
    return 0;
}

HPX equivalent:

std::uint32_t const num_localities =
    hpx::get_num_localities(hpx::launch::sync);
std::uint32_t const here = hpx::get_locality_id();

auto const all_reduce_direct_client =
    create_communicator(all_reduce_direct_basename,
        num_sites_arg(num_localities), this_site_arg(here));

std::uint32_t value = here;

hpx::future<std::uint32_t> overall_result =
    all_reduce(all_reduce_direct_client, value, std::plus<std::uint32_t>{});

std::uint32_t res = overall_result.get();
std::cout << "Locality " << here << " has value: " << res << std::endl;

For num_localities = 2 this code will print the following message:

Locality 0 has value: 1
Locality 1 has value: 1

HPX uses the function all_reduce to implement the functionality of MPI_Allreduce. In more detail:

  • hpx::get_num_localities(hpx::launch::sync) retrieves the number of localities, while hpx::get_locality_id() returns the ID of the current locality.

  • The function hpx::collectives::create_communicator() is used to create a communicator called all_reduce_direct_client.

  • The value of each locality is equal to its ID.

  • The reduce operation is performed using all_reduce. The result is stored in an hpx::future object called overall_result, which represents a future result that can be retrieved later when needed.

  • The get() function waits until the result is available and then stores it in the variable res.
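The effect of an all-reduce can be sketched sequentially as follows. This is an illustration under stated assumptions, not HPX code: model_all_reduce is a hypothetical name, values[i] stands for the value contributed by locality i, and the operator defaults to addition as in the example above.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Sequential model of an all-reduce (hypothetical helper, not part of HPX).
// values[i] is the value supplied by locality i; op combines two values.
// Every locality receives the same reduced value, so the result has one
// (identical) entry per locality. An empty input yields an empty result.
template <typename T, typename Op = std::plus<T>>
std::vector<T> model_all_reduce(std::vector<T> const& values, Op op = Op{})
{
    if (values.empty())
        return {};

    // Fold all contributions, starting from the first one.
    T const reduced =
        std::accumulate(values.begin() + 1, values.end(), values.front(), op);

    return std::vector<T>(values.size(), reduced);
}
```

With the inputs {0, 1} both localities receive 1, as in the output above. With four localities contributing their IDs, each would receive 0 + 1 + 2 + 3 = 6.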

MPI_Alltoall#

The following code gathers data from and scatters data to all processes.

MPI code:

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <mpi.h>
#include <vector>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Get the number of MPI processes
    int num_localities = size;

    // Get the MPI process rank
    int this_locality = rank;

    // Create a communicator for all-to-all operation.
    MPI_Comm all_to_all_direct_client;
    MPI_Comm_split(MPI_COMM_WORLD, 0, rank, &all_to_all_direct_client);

    std::vector<std::uint32_t> values(num_localities);
    std::fill(values.begin(), values.end(), this_locality);

    // Create vectors to store received values.
    std::vector<std::uint32_t> r(num_localities);

    // Perform an all-to-all operation to exchange values with other localities.
    MPI_Alltoall(values.data(), 1, MPI_UINT32_T, r.data(), 1, MPI_UINT32_T,
                all_to_all_direct_client);

    // Print the results.
    std::cout << "Locality " << this_locality << " has values:";
    for (std::size_t j = 0; j != r.size(); ++j) {
        std::cout << " " << r[j];
    }
    std::cout << std::endl;

    MPI_Finalize();
    return 0;
}

HPX equivalent:

std::uint32_t num_localities = hpx::get_num_localities(hpx::launch::sync);
std::uint32_t this_locality = hpx::get_locality_id();

auto all_to_all_direct_client =
    create_communicator(all_to_all_direct_basename,
        num_sites_arg(num_localities), this_site_arg(this_locality));

std::vector<std::uint32_t> values(num_localities);
std::fill(values.begin(), values.end(), this_locality);

hpx::future<std::vector<std::uint32_t>> overall_result =
    all_to_all(all_to_all_direct_client, std::move(values));

std::vector<std::uint32_t> r = overall_result.get();
std::cout << "Locality " << this_locality << " has values:";

for (std::size_t j = 0; j != r.size(); ++j)
{
    std::cout << " " << r[j];
}
std::cout << std::endl;

For num_localities = 2 this code will print the following message:

Locality 0 has values: 0 1
Locality 1 has values: 0 1

HPX uses the function all_to_all to implement the functionality of MPI_Alltoall. In more detail:

  • hpx::get_num_localities(hpx::launch::sync) retrieves the number of localities, while hpx::get_locality_id() returns the ID of the current locality.

  • The function hpx::collectives::create_communicator() is used to create a communicator called all_to_all_direct_client.

  • The value each locality sends is equal to its ID.

  • The all-to-all operation is performed using all_to_all. The result is stored in an hpx::future object called overall_result, which represents a future result that can be retrieved later when needed.

  • The get() function waits until the result is available and then stores it in the variable r.
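An all-to-all exchange behaves like a matrix transpose across localities. The following sequential sketch models that behavior with standard containers. It is not HPX code: model_all_to_all is a hypothetical name, and send[i][j] is assumed to be the element that locality i addresses to locality j.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sequential model of an all-to-all exchange (hypothetical helper, not HPX).
// send[i][j] is the element that locality i addresses to locality j.
// The result recv satisfies recv[j][i] == send[i][j]: locality j collects
// one element from every locality, ordered by the sender's ID.
std::vector<std::vector<std::uint32_t>> model_all_to_all(
    std::vector<std::vector<std::uint32_t>> const& send)
{
    std::size_t const n = send.size();
    std::vector<std::vector<std::uint32_t>> recv(
        n, std::vector<std::uint32_t>(n));

    for (std::size_t i = 0; i != n; ++i)
    {
        // Each locality must provide exactly one element per destination.
        assert(send[i].size() == n);
        for (std::size_t j = 0; j != n; ++j)
        {
            recv[j][i] = send[i][j];
        }
    }
    return recv;
}
```

In the example above locality 0 sends {0, 0} and locality 1 sends {1, 1}, so each locality receives {0, 1}. Filling every send buffer with the sender's ID is what makes both output lines identical.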

MPI_Barrier#

The following code shows how a barrier is used to synchronize multiple processes.

MPI code:

#include <cstdlib>
#include <iostream>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    std::size_t iterations = 5;

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    for (std::size_t i = 0; i != iterations; ++i) {
        MPI_Barrier(MPI_COMM_WORLD);
        if (rank == 0) {
            std::cout << "Iteration " << i << " completed." << std::endl;
        }
    }

    MPI_Finalize();
    return 0;
}

HPX equivalent:

std::size_t iterations = 5;
std::uint32_t this_locality = hpx::get_locality_id();

char const* const barrier_test_name = "/test/barrier/multiple";

hpx::distributed::barrier b(barrier_test_name);
for (std::size_t i = 0; i != iterations; ++i)
{
    b.wait();
    if (this_locality == 0)
    {
        std::cout << "Iteration " << i << " completed." << std::endl;
    }
}

This code will print the following message:

Iteration 0 completed.
Iteration 1 completed.
Iteration 2 completed.
Iteration 3 completed.
Iteration 4 completed.

HPX uses the class hpx::distributed::barrier to implement the functionality of MPI_Barrier. In more detail:

  • After defining the number of iterations, we use hpx::get_locality_id() to get the ID of the current locality.

  • char const* const barrier_test_name = "/test/barrier/multiple": This line defines a constant pointer to a string literal that serves as the name of the barrier. This name is used to identify the barrier across different localities. All participating threads that use this name will synchronize at this barrier.

  • Using hpx::distributed::barrier b(barrier_test_name), we create an instance of the distributed barrier with the previously defined name. This barrier will be used to synchronize the execution of threads across different localities.

  • In each iteration of the loop, we call b.wait() to synchronize the threads. Each thread waits until all other threads have also reached this point before any of them can proceed further.
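The reuse of one barrier object across several iterations can be illustrated with a small bookkeeping model. This sketch only counts arrivals and phases; it does not block callers and is not how hpx::distributed::barrier is implemented. The class name barrier_model and its member functions are hypothetical.

```cpp
#include <cstddef>

// Bookkeeping model of a reusable barrier (hypothetical, not part of HPX).
// It only counts arrivals; a real barrier additionally blocks each caller
// until the current phase completes.
class barrier_model
{
public:
    explicit barrier_model(std::size_t participants)
      : participants_(participants)
    {
    }

    // Record one arrival. Returns true for the arrival that completes the
    // phase, i.e. the point at which all waiting participants are released.
    bool arrive()
    {
        if (++arrived_ != participants_)
            return false;

        arrived_ = 0;    // reset so the barrier can be reused
        ++phase_;
        return true;
    }

    std::size_t completed_phases() const
    {
        return phase_;
    }

private:
    std::size_t participants_;
    std::size_t arrived_ = 0;
    std::size_t phase_ = 0;
};
```

With two participants, the first call to arrive() returns false and the second returns true. Five iterations of the loop above correspond to five completed phases in this model.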

MPI_Bcast#

The following code broadcasts data from one process to all other processes.

MPI code:

#include <iostream>
#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int num_localities;
    MPI_Comm_size(MPI_COMM_WORLD, &num_localities);

    int here;
    MPI_Comm_rank(MPI_COMM_WORLD, &here);

    int value;

    for (int i = 0; i < 5; ++i) {
        if (here == 0) {
            value = i + 42;
        }

        // Broadcast the value from process 0 to all other processes
        MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

        if (here != 0) {
            std::cout << "Locality " << here << " received " << value << std::endl;
        }

    }

    MPI_Finalize();
    return 0;
}

HPX equivalent:

std::uint32_t num_localities = hpx::get_num_localities(hpx::launch::sync);

std::uint32_t here = hpx::get_locality_id();

auto broadcast_direct_client =
    create_communicator(broadcast_direct_basename,
        num_sites_arg(num_localities), this_site_arg(here));

// test functionality based on immediate local result value
for (std::uint32_t i = 0; i != 5; ++i)
{
    if (here == 0)
    {
        hpx::future<std::uint32_t> result =
            broadcast_to(broadcast_direct_client, i + 42);

        result.get();
    }
    else
    {
        hpx::future<std::uint32_t> result =
            hpx::collectives::broadcast_from<std::uint32_t>(
                broadcast_direct_client);

        std::uint32_t r = result.get();

        std::cout << "Locality " << here << " received " << r << std::endl;
    }
}

For num_localities = 2 this code will print the following message:

Locality 1 received 42
Locality 1 received 43
Locality 1 received 44
Locality 1 received 45
Locality 1 received 46

HPX uses two functions to implement the functionality of MPI_Bcast: broadcast_to and broadcast_from. broadcast_to broadcasts the data from the root locality to all other localities. broadcast_from allows non-root localities to receive the data sent by the root locality. In more detail:

  • hpx::get_num_localities(hpx::launch::sync) retrieves the number of localities, while hpx::get_locality_id() returns the ID of the current locality.

  • The function create_communicator() is used to create a communicator called broadcast_direct_client.

  • If the current locality is the root (its ID is equal to 0):

    • The broadcast_to function is used to perform the broadcast operation using the communicator broadcast_direct_client. This sends the data to other localities and returns a future representing the result.

    • The get() member function of the result future is used to wait for and retrieve the result.

  • If the current locality is not the root:

    • The broadcast_from function is used to receive the data sent by the root locality.

    • The get() member function of the result future is used to wait for the result.
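As a rough illustration of the result of a broadcast, the following sequential sketch copies the root's value to every locality. It uses only the standard library and is not HPX code. The name model_broadcast is hypothetical, and before[i] stands for whatever locality i held prior to the operation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sequential model of a broadcast (hypothetical helper, not part of HPX).
// before[i] is the value locality i holds prior to the operation; only the
// root's entry matters. Afterwards every locality holds the root's value.
// at() throws std::out_of_range if root is not a valid locality index.
std::vector<std::uint32_t> model_broadcast(
    std::vector<std::uint32_t> const& before, std::size_t root)
{
    return std::vector<std::uint32_t>(before.size(), before.at(root));
}
```

In the first loop iteration of the example above the root holds 42, so every locality ends up with 42 regardless of what it held before.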

MPI_Exscan#

The following code computes the exclusive scan (partial reductions) of data on a collection of processes.

MPI code:

#include <iostream>
#include <mpi.h>
#include <numeric>
#include <vector>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int num_localities;
    MPI_Comm_size(MPI_COMM_WORLD, &num_localities);

    int here;
    MPI_Comm_rank(MPI_COMM_WORLD, &here);

    // Calculate the value for this locality (here)
    int value = here;

    // Perform an exclusive scan. MPI_Exscan delivers one value per rank;
    // the receive buffer on rank 0 is left undefined by the operation.
    int r = 0;
    MPI_Exscan(&value, &r, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    if (here != 0) {
        std::cout << "Locality " << here << " has value " << r << std::endl;
    }

    MPI_Finalize();
    return 0;
}

HPX equivalent:

std::uint32_t num_localities = hpx::get_num_localities(hpx::launch::sync);
std::uint32_t here = hpx::get_locality_id();

auto exclusive_scan_client = create_communicator(exclusive_scan_basename,
    num_sites_arg(num_localities), this_site_arg(here));

// test functionality based on immediate local result value
std::uint32_t value = here;

hpx::future<std::uint32_t> overall_result = exclusive_scan(
    exclusive_scan_client, value, std::plus<std::uint32_t>{});

std::uint32_t r = overall_result.get();

if (here != 0)
{
    std::cout << "Locality " << here << " has value " << r << std::endl;
}

For num_localities = 2 this code will print the following message:

Locality 1 has value 0

HPX uses the function exclusive_scan to implement MPI_Exscan. In more detail:

  • hpx::get_num_localities(hpx::launch::sync) retrieves the number of localities, while hpx::get_locality_id() returns the ID of the current locality.

  • The function create_communicator() is used to create a communicator called exclusive_scan_client.

  • The exclusive_scan function is used to perform the exclusive scan operation using the communicator exclusive_scan_client. std::plus<std::uint32_t>{} specifies the binary associative operator to use for the scan. In this case, it’s addition for summing values.

  • The get() member function of the overall_result future is used to wait for the result.
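The values an exclusive scan hands to each locality can be reproduced sequentially with std::exclusive_scan (C++17). The sketch below is an illustration, not HPX code. The name model_exclusive_scan is hypothetical, values[i] stands for the contribution of locality i, and the initial value 0 is an assumption of this model that determines entry 0.

```cpp
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Sequential model of an exclusive scan (hypothetical helper, not HPX).
// values[i] is the value supplied by locality i. Entry i of the result is
// the sum of values[0] .. values[i - 1]; the locality's own value is
// excluded. Entry 0 is simply the initial value 0 chosen for this model.
std::vector<std::uint32_t> model_exclusive_scan(
    std::vector<std::uint32_t> const& values)
{
    std::vector<std::uint32_t> result(values.size());
    std::exclusive_scan(values.begin(), values.end(), result.begin(),
        std::uint32_t{0}, std::plus<std::uint32_t>{});
    return result;
}
```

For the contributions {0, 1, 2, 3} the model yields {0, 0, 1, 3}. Locality 1 receives 0, which agrees with the two-locality output above.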

MPI_Scan#

The following code computes the inclusive scan (partial reductions) of data on a collection of processes.

MPI code:

#include <iostream>
#include <mpi.h>
#include <numeric>
#include <vector>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int num_localities;
    MPI_Comm_size(MPI_COMM_WORLD, &num_localities);

    int here;
    MPI_Comm_rank(MPI_COMM_WORLD, &here);

    // Calculate the value for this locality (here)
    int value = here;

    int result = 0;

    // Perform an inclusive scan. Each rank receives a single value.
    MPI_Scan(&value, &result, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    std::cout << "Locality " << here << " has value " << result << std::endl;

    MPI_Finalize();
    return 0;
}

HPX equivalent:

std::uint32_t num_localities = hpx::get_num_localities(hpx::launch::sync);
std::uint32_t here = hpx::get_locality_id();

auto inclusive_scan_client = create_communicator(inclusive_scan_basename,
    num_sites_arg(num_localities), this_site_arg(here));

std::uint32_t value = here;

hpx::future<std::uint32_t> overall_result = inclusive_scan(
    inclusive_scan_client, value, std::plus<std::uint32_t>{});

std::uint32_t r = overall_result.get();

std::cout << "Locality " << here << " has value " << r << std::endl;

For num_localities = 2 this code will print the following message:

Locality 0 has value 0
Locality 1 has value 1

HPX uses the function inclusive_scan to implement MPI_Scan. In more detail:

  • hpx::get_num_localities(hpx::launch::sync) retrieves the number of localities, while hpx::get_locality_id() returns the ID of the current locality.

  • The function create_communicator() is used to create a communicator called inclusive_scan_client.

  • The inclusive_scan function is used to perform the inclusive scan operation using the communicator inclusive_scan_client. std::plus<std::uint32_t>{} specifies the binary associative operator to use for the scan. In this case, it’s addition for summing values.

  • The get() member function of the overall_result future is used to wait for the result.
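For comparison with the exclusive variant, an inclusive scan can be modeled sequentially with std::inclusive_scan (C++17). This is a sketch using the standard library only, not HPX code. The name model_inclusive_scan is hypothetical and values[i] stands for the contribution of locality i.

```cpp
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Sequential model of an inclusive scan (hypothetical helper, not HPX).
// values[i] is the value supplied by locality i. Entry i of the result is
// the sum of values[0] .. values[i], so the locality's own value is included.
std::vector<std::uint32_t> model_inclusive_scan(
    std::vector<std::uint32_t> const& values)
{
    std::vector<std::uint32_t> result(values.size());
    std::inclusive_scan(values.begin(), values.end(), result.begin(),
        std::plus<std::uint32_t>{});
    return result;
}
```

For the contributions {0, 1, 2, 3} the model yields {0, 1, 3, 6}. The first two entries agree with the two-locality output above.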

MPI_Reduce#

The following code performs a global reduce operation across all processes.

MPI code:

#include <iostream>
#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int num_processes;
    MPI_Comm_size(MPI_COMM_WORLD, &num_processes);

    int this_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &this_rank);

    int value = this_rank;

    int result = 0;

    // Perform the reduction operation
    MPI_Reduce(&value, &result, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    // Print the result for the root process (process 0)
    if (this_rank == 0) {
        std::cout << "Locality " << this_rank << " has value " << result
                << std::endl;
    }

    MPI_Finalize();
    return 0;
}

HPX equivalent:

std::uint32_t num_localities = hpx::get_num_localities(hpx::launch::sync);
std::uint32_t this_locality = hpx::get_locality_id();

auto reduce_direct_client = create_communicator(reduce_direct_basename,
    num_sites_arg(num_localities), this_site_arg(this_locality));

std::uint32_t value = hpx::get_locality_id();

if (this_locality == 0)
{
    hpx::future<std::uint32_t> overall_result = reduce_here(
        reduce_direct_client, value, std::plus<std::uint32_t>{});

    std::uint32_t r = overall_result.get();

    std::cout << "Locality " << this_locality << " has value " << r
              << std::endl;
}
else
{
    hpx::future<void> overall_result =
        reduce_there(reduce_direct_client, std::move(value));
    overall_result.get();
}

For num_localities = 2 this code will print the following message:

Locality 0 has value 1

HPX uses two functions to implement the functionality of MPI_Reduce: reduce_here and reduce_there. reduce_here gathers data from all localities on the locality with ID 0 (root locality) and then performs the defined reduction operation. reduce_there allows non-root localities to participate in the reduction operation by sending data to the root locality. In more detail:

  • hpx::get_num_localities(hpx::launch::sync) retrieves the number of localities, while hpx::get_locality_id() returns the ID of the current locality.

  • The function create_communicator() is used to create a communicator called reduce_direct_client.

  • If the current locality is the root (its ID is equal to 0):

    • The reduce_here function initiates a reduction operation with addition (std::plus) as the reduction operator. The result is stored in overall_result.

    • The get() member function of the overall_result future is used to wait for the result.

  • If the current locality is not the root:

    • The reduce_there function initiates a remote reduction operation by sending this locality’s value to the root.

    • The get() member function of the overall_result future is used to wait for the remote reduction operation to complete. This is done to ensure synchronization among localities.
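The asymmetry between the root and the other localities can be sketched sequentially as follows. This is an illustration with standard containers, not HPX code. The name model_reduce is hypothetical, values[i] stands for the contribution of locality i, and addition is assumed as the reduction operator, as in the example above.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <optional>
#include <vector>

// Sequential model of a rooted reduction (hypothetical helper, not HPX).
// values[i] is the value supplied by locality i. Only the root receives the
// reduced value; every other locality ends up with no result (std::nullopt).
// at() throws std::out_of_range if root is not a valid locality index.
std::vector<std::optional<std::uint32_t>> model_reduce(
    std::vector<std::uint32_t> const& values, std::size_t root)
{
    std::vector<std::optional<std::uint32_t>> result(values.size());
    result.at(root) =
        std::accumulate(values.begin(), values.end(), std::uint32_t{0});
    return result;
}
```

For the contributions {0, 1} and root 0, entry 0 holds 1 and entry 1 is empty. This mirrors the example above, where only locality 0 prints a value and the other locality obtains a future<void>.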

Building tests and examples#

Tests#

To build the tests:

$ cmake --build . --target tests

To control which tests to run use ctest:

  • To run single tests, for example a test for for_loop:

$ ctest --output-on-failure -R tests.unit.modules.algorithms.algorithms.for_loop

  • To run a whole group of tests:

$ ctest --output-on-failure -R tests.unit

Examples#

  • To build (and install) all examples invoke:

$ cmake -DHPX_WITH_EXAMPLES=On .
$ make examples
$ make install

  • To build the hello_world_1 example run:

$ make hello_world_1

HPX executables end up in the bin directory in your build directory. You can now run hello_world_1 and should see the following output:

$ ./bin/hello_world_1
Hello World!

You’ve just run an example which prints Hello World! from the HPX runtime. The source for the example is in examples/quickstart/hello_world_1.cpp. The hello_world_distributed example (also available in the examples/quickstart directory) is a distributed hello world program, which is described in Remote execution with actions. It provides a gentle introduction to the distributed aspects of HPX.

Tip

Most build targets in HPX have two names: a simple name and a hierarchical name corresponding to what type of example or test the target is. If you are developing HPX it is often helpful to run make help to get a list of available targets. For example, make help | grep hello_world outputs the following:

... examples.quickstart.hello_world_2
... hello_world_2
... examples.quickstart.hello_world_1
... hello_world_1
... examples.quickstart.hello_world_distributed
... hello_world_distributed

It is also possible to build, for instance, all quickstart examples using make examples.quickstart.

Creating HPX projects#

Using HPX with pkg-config#
How to build HPX applications with pkg-config#

After you are done installing HPX, you should be able to build the following program. It prints Hello World! on the locality you run it on.

// Including 'hpx/hpx_main.hpp' instead of the usual 'hpx/hpx_init.hpp' makes it
// possible to use the plain C main below as the direct main HPX entry point.
#include <hpx/hpx_main.hpp>
#include <hpx/iostream.hpp>

int main()
{
    // Say hello to the world!
    hpx::cout << "Hello World!\n" << std::flush;
    return 0;
}

Copy the text of this program into a file called hello_world.cpp.

Now, in the directory where you put hello_world.cpp, issue the following commands (where $HPX_LOCATION is the build directory or CMAKE_INSTALL_PREFIX you used while building HPX):

$ export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:$HPX_LOCATION/lib/pkgconfig
$ c++ -o hello_world hello_world.cpp \
   `pkg-config --cflags --libs hpx_application`\
    -lhpx_iostreams -DHPX_APPLICATION_NAME=hello_world

Important

When using pkg-config with HPX, the pkg-config flags must go after the -o flag.

Note

HPX libraries have different names in debug and release mode. If you want to link against a debug HPX library, you need to use the _debug suffix for the pkg-config name. That means instead of hpx_application or hpx_component, you will have to use hpx_application_debug or hpx_component_debug. Moreover, all referenced HPX components need to have an appended d suffix. For example, instead of -lhpx_iostreams you will need to specify -lhpx_iostreamsd.

Important

If the HPX libraries are in a path that is not found by the dynamic linker, you will need to add the path $HPX_LOCATION/lib to your linker search path (for example LD_LIBRARY_PATH on Linux).

To test the program, type:

$ ./hello_world

which should print Hello World! and exit.

How to build HPX components with pkg-config#

Let’s try a more complex example involving an HPX component. An HPX component is a class that exposes HPX actions. HPX components are compiled into dynamically loaded modules called component libraries. Here’s the source code:

hello_world_component.cpp

#include <hpx/config.hpp>
#if !defined(HPX_COMPUTE_DEVICE_CODE)
#include <hpx/iostream.hpp>
#include "hello_world_component.hpp"

#include <iostream>

namespace examples { namespace server {
    void hello_world::invoke()
    {
        hpx::cout << "Hello HPX World!" << std::endl;
    }
}}    // namespace examples::server

HPX_REGISTER_COMPONENT_MODULE()

typedef hpx::components::component<examples::server::hello_world>
    hello_world_type;

HPX_REGISTER_COMPONENT(hello_world_type, hello_world)

HPX_REGISTER_ACTION(
    examples::server::hello_world::invoke_action, hello_world_invoke_action)
#endif

hello_world_component.hpp

#pragma once

#include <hpx/config.hpp>
#if !defined(HPX_COMPUTE_DEVICE_CODE)
#include <hpx/hpx.hpp>
#include <hpx/include/actions.hpp>
#include <hpx/include/components.hpp>
#include <hpx/include/lcos.hpp>
#include <hpx/serialization.hpp>

#include <utility>

namespace examples { namespace server {
    struct HPX_COMPONENT_EXPORT hello_world
      : hpx::components::component_base<hello_world>
    {
        void invoke();
        HPX_DEFINE_COMPONENT_ACTION(hello_world, invoke)
    };
}}    // namespace examples::server

HPX_REGISTER_ACTION_DECLARATION(
    examples::server::hello_world::invoke_action, hello_world_invoke_action)

namespace examples {
    struct hello_world
      : hpx::components::client_base<hello_world, server::hello_world>
    {
        typedef hpx::components::client_base<hello_world, server::hello_world>
            base_type;

        hello_world(hpx::future<hpx::id_type>&& f)
          : base_type(std::move(f))
        {
        }

        hello_world(hpx::id_type&& f)
          : base_type(std::move(f))
        {
        }

        void invoke()
        {
            hpx::async<server::hello_world::invoke_action>(this->get_id())
                .get();
        }
    };
}    // namespace examples

#endif

hello_world_client.cpp

#include <hpx/config.hpp>
#if defined(HPX_COMPUTE_HOST_CODE)
#include <hpx/wrap_main.hpp>

#include "hello_world_component.hpp"

int main()
{
    {
        // Create a single instance of the component on this locality.
        examples::hello_world client =
            hpx::new_<examples::hello_world>(hpx::find_here());

        // Invoke the component's action, which will print "Hello HPX World!".
        client.invoke();
    }

    return 0;
}
#endif

Copy the three source files above into three files (called hello_world_component.cpp, hello_world_component.hpp and hello_world_client.cpp, respectively).

Now, in the directory where you put the files, run the following command to build the component library (where $HPX_LOCATION is the build directory or CMAKE_INSTALL_PREFIX you used while building HPX):

$ export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:$HPX_LOCATION/lib/pkgconfig
$ c++ -o libhpx_hello_world.so hello_world_component.cpp \
   `pkg-config --cflags --libs hpx_component` \
    -lhpx_iostreams -DHPX_COMPONENT_NAME=hpx_hello_world

Now pick a directory in which to install your HPX component libraries. For this example, we’ll choose a directory named my_hpx_libs:

$ mkdir ~/my_hpx_libs
$ mv libhpx_hello_world.so ~/my_hpx_libs

Note

HPX libraries have different names in debug and release mode. If you want to link against a debug HPX library, you need to use the _debug suffix for the pkg-config name. That means instead of hpx_application or hpx_component you will have to use hpx_application_debug or hpx_component_debug. Moreover, all referenced HPX components need to have an appended d suffix. For example, instead of -lhpx_iostreams you will need to specify -lhpx_iostreamsd.

Important

If the HPX libraries are in a path that is not found by the dynamic linker, you will need to add the path $HPX_LOCATION/lib to your linker search path (for example LD_LIBRARY_PATH on Linux).

Now, to build the application that uses this component (hello_world_client.cpp), we do:

$ export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:$HPX_LOCATION/lib/pkgconfig
$ c++ -o hello_world_client hello_world_client.cpp \
   `pkg-config --cflags --libs hpx_application` \
    -L${HOME}/my_hpx_libs -lhpx_hello_world -lhpx_iostreams

Important

When using pkg-config with HPX, the pkg-config flags must go after the -o flag.

Finally, you’ll need to set your LD_LIBRARY_PATH before you can run the program. To run the program, type:

$ export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$HOME/my_hpx_libs"
$ ./hello_world_client

which should print Hello HPX World! and exit.

Using HPX with CMake-based projects#

In addition to the pkg-config support discussed in the previous sections, HPX comes with full CMake support. In order to integrate HPX into an existing or new CMakeLists.txt, you can leverage the find_package command integrated into CMake. The following is a Hello World component example using CMake.

Let’s revisit what we have. We have three files that compose our example application:

  • hello_world_component.hpp

  • hello_world_component.cpp

  • hello_world_client.cpp

The basic structure to include HPX into your CMakeLists.txt is shown here:

# Require a recent version of cmake
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)

# This project is C++ based.
project(your_app CXX)

# Instruct cmake to find the HPX settings
find_package(HPX)

In order to have CMake find HPX, it needs to be told where to look for the HPXConfig.cmake file that is generated when HPX is built or installed. It is used by find_package(HPX) to set up all the necessary macros needed to use HPX in your project. The ways to achieve this are:

  • Set the HPX_DIR CMake variable to point to the directory containing the HPXConfig.cmake script on the command line when you invoke CMake:

    $ cmake -DHPX_DIR=$HPX_LOCATION/lib/cmake/HPX ...
    

    where $HPX_LOCATION is the build directory or CMAKE_INSTALL_PREFIX you used when building/configuring HPX.

  • Set the CMAKE_PREFIX_PATH variable to the root directory of your HPX build or install location on the command line when you invoke CMake:

    $ cmake -DCMAKE_PREFIX_PATH=$HPX_LOCATION ...
    

    The difference between CMAKE_PREFIX_PATH and HPX_DIR is that CMake will add common postfixes, such as lib/cmake/<project>, to the CMAKE_PREFIX_PATH and search in these locations too. Note that if your project uses HPX as well as other CMake-managed projects, the paths to the locations of these multiple projects may be concatenated in the CMAKE_PREFIX_PATH.

  • The variables above may be set in the CMake GUI or curses ccmake interface instead of the command line.

Additionally, if you wish to require HPX for your project, replace the find_package(HPX) line with find_package(HPX REQUIRED).

You can check if HPX was successfully found with the HPX_FOUND CMake variable.

Using CMake targets#

The recommended way of setting up your targets to use HPX is to link to the HPX::hpx CMake target:

target_link_libraries(hello_world_component PUBLIC HPX::hpx)

This requires that you have already created the target like this:

add_library(hello_world_component SHARED hello_world_component.cpp)
target_include_directories(hello_world_component PUBLIC ${CMAKE_CURRENT_SOURCE_DIR})

When you link your library to the HPX::hpx CMake target, you will be able to use HPX functionality in your library. To use main() as the implicit entry point in your application you must additionally link your application to the CMake target HPX::wrap_main. This target is automatically linked to executables if you are using the macros described below (Using macros to create new targets). See Re-use the main() function as the main HPX entry point for more information on implicitly using main() as the entry point. If you want the same wrapping behavior without including hpx/hpx_main.hpp, link to the HPX::auto_wrap_main target instead. This enables the runtime initialization around main() unconditionally and is useful for codebases where adding the header to main.cpp is impractical.

Note

The use of HPX::auto_wrap_main is not supported when using the native Windows MSVC toolchain.

If you want to use the facilities exposed by hpx::runtime_manager in binaries that were not linked as executables (e.g., in shared libraries), you will need to make your CMake target explicitly depend on the HPX::init target:

add_library(hello_world_component SHARED hello_world_component.cpp)
target_link_libraries(hello_world_component PRIVATE HPX::init)

Otherwise you may see compilation errors complaining about the header file hpx/runtime_manager.hpp not being found.

Creating a component requires setting two additional compile definitions:

target_compile_definitions(hello_world_component PRIVATE
  HPX_COMPONENT_NAME=hello_world
  HPX_COMPONENT_EXPORTS)

Instead of setting these definitions manually you may link to the HPX::component target, which sets HPX_COMPONENT_NAME to hpx_<target_name>, where <target_name> is the target name of your library. Note that these definitions should be PRIVATE to make sure these definitions are not propagated transitively to dependent targets.

In addition to making your library a component you can make it a plugin. To do so link to the HPX::plugin target. Similarly to HPX::component this will set HPX_PLUGIN_NAME to hpx_<target_name>. This definition should also be PRIVATE. Unlike regular shared libraries, plugins are loaded at runtime from certain directories and will not be found without additional configuration. Plugins should be installed into a directory containing only plugins. For example, the plugins created by HPX itself are installed into the hpx subdirectory in the library install directory (typically lib or lib64). When using the HPX::plugin target you need to install your plugins into an appropriate directory. You may also want to set the location of your plugin in the build directory with the *_OUTPUT_DIRECTORY* CMake target properties to be able to load the plugins in the build directory. Once you’ve set the install or output directory of your plugin you need to tell your executable where to find it at runtime. You can do this either by setting the environment variable HPX_COMPONENT_PATHS or the ini setting hpx.component_paths (see --hpx:ini) to the directory containing your plugin.

Using macros to create new targets#

In addition to the targets described above, HPX provides convenience macros to hide optional boilerplate code that may be useful for your project. The macros link to the targets described above. We recommend that you use the targets directly whenever possible as they tend to compose better with other targets.

The macro for adding an HPX component is add_hpx_component. It can be used in your CMakeLists.txt file like this:

# build your application using HPX
add_hpx_component(hello_world
    SOURCES hello_world_component.cpp
    HEADERS hello_world_component.hpp
    COMPONENT_DEPENDENCIES iostreams)

Note

add_hpx_component adds a _component suffix to the target name. In the example above, a hello_world_component target will be created.

The available options to add_hpx_component are:

  • SOURCES: The source files for that component

  • HEADERS: The header files for that component

  • DEPENDENCIES: Other libraries or targets this component depends on

  • COMPONENT_DEPENDENCIES: The components this component depends on

  • PLUGIN: Treats this component as a plugin-able library

  • COMPILE_FLAGS: Additional compiler flags

  • LINK_FLAGS: Additional linker flags

  • FOLDER: Adds the headers and source files to this Source Group folder

  • EXCLUDE_FROM_ALL: Do not build this component as part of the all target

After adding the component, the way you add the executable is as follows:

# build your application using HPX
add_hpx_executable(hello_world
    SOURCES hello_world_client.cpp
    COMPONENT_DEPENDENCIES hello_world)

Note

add_hpx_executable automatically adds a _component suffix to dependencies specified in COMPONENT_DEPENDENCIES, meaning you can directly use the name given when adding a component using add_hpx_component.

When you configure your application, all you need to do is set the HPX_DIR variable to point to the installation of HPX.

Note

All library targets built with HPX are exported and readily available to be used as arguments to target_link_libraries in your targets. The HPX include directories are available with the HPX_INCLUDE_DIRS CMake variable.

Using the HPX compiler wrapper hpxcxx#

The hpxcxx compiler wrapper helps to compile an HPX component, application, or object file, based on the arguments passed to it.

$ hpxcxx [--exe=<APPLICATION_NAME> | --comp=<COMPONENT_NAME> | -c] FLAGS FILES

The hpxcxx command requires one of --exe=<APPLICATION_NAME>, --comp=<COMPONENT_NAME>, or -c to be specified. When building against a debug build of HPX, also specify -g.

Optional FLAGS#
  • -l <LIBRARY> | -l<LIBRARY>: Links <LIBRARY> to the build

  • -g: Specifies that the application or component build is against a debug build

  • -rd: Sets release-with-debug-info option

  • -mr: Sets minsize-release option

All other flags (like -o OUTPUT_FILE) are directly passed to the underlying C++ compiler.

Using macros to set up existing targets to use HPX#

In addition to add_hpx_component and add_hpx_executable, you can use the hpx_setup_target macro to set up an already existing target for use with the HPX libraries:

hpx_setup_target(target)

Optional parameters are:

  • EXPORT: Adds it to the CMake export list HPXTargets

  • INSTALL: Generates an install rule for the target

  • PLUGIN: Treats this component as a plugin-able library

  • TYPE: The type can be: EXECUTABLE, LIBRARY or COMPONENT

  • DEPENDENCIES: Other libraries or targets this component depends on

  • COMPONENT_DEPENDENCIES: The components this component depends on

  • COMPILE_FLAGS: Additional compiler flags

  • LINK_FLAGS: Additional linker flags

If you do not use CMake, you can still build against HPX, but you should refer to the section on How to build HPX components with pkg-config.

Note

Since HPX relies on dynamic libraries, the dynamic linker needs to know where to look for them. If HPX isn’t installed into a path that is configured as a linker search path, external projects need to either set RPATH or adapt LD_LIBRARY_PATH to point to where the HPX libraries reside. In order to set RPATHs, you can include HPX_SetFullRPATH in your project after all libraries you want to link against have been added. Please also consult the CMake documentation here.

Using HPX with Makefile#

A basic project can also be built against HPX with a hand-written Makefile. How complex the Makefile becomes depends on the CMake parameter HPX_WITH_HPX_MAIN (which defaults to ON).

How to build HPX applications with makefile#

If HPX is installed correctly, you should be able to build and run a simple Hello World program. It prints Hello World! on the locality you run it on.

// Including 'hpx/hpx_main.hpp' instead of the usual 'hpx/hpx_init.hpp' enables
// to use the plain C-main below as the direct main HPX entry point.
#include <hpx/hpx_main.hpp>
#include <hpx/iostream.hpp>

int main()
{
    // Say hello to the world!
    hpx::cout << "Hello World!\n" << std::flush;
    return 0;
}

Copy the content of this program into a file called hello_world.cpp.

Now, in the directory where you put hello_world.cpp, create a Makefile. Add the following code:

# CXX=/path/to/compiler  # Set your favourite compiler here, or leave this commented out to let make choose its default.

CXXFLAGS=-O3 -std=c++17

Boost_ROOT=/path/to/boost
Hwloc_ROOT=/path/to/hwloc
Tcmalloc_ROOT=/path/to/tcmalloc
HPX_ROOT=/path/to/hpx

INCLUDE_DIRECTIVES=-I$(HPX_ROOT)/include -I$(Boost_ROOT)/include -I$(Hwloc_ROOT)/include

LIBRARY_DIRECTIVES=-L$(HPX_ROOT)/lib $(HPX_ROOT)/lib/libhpx_init.a $(HPX_ROOT)/lib/libhpx.so $(Boost_ROOT)/lib/libboost_atomic-mt.so $(Boost_ROOT)/lib/libboost_filesystem-mt.so $(Boost_ROOT)/lib/libboost_program_options-mt.so $(Boost_ROOT)/lib/libboost_regex-mt.so $(Boost_ROOT)/lib/libboost_system-mt.so -lpthread $(Tcmalloc_ROOT)/libtcmalloc_minimal.so $(Hwloc_ROOT)/libhwloc.so -ldl -lrt

LINK_FLAGS=$(HPX_ROOT)/lib/libhpx_wrap.a -Wl,-wrap=main  # should be left empty for HPX_WITH_HPX_MAIN=OFF

hello_world: hello_world.o
	$(CXX) $(CXXFLAGS) -o hello_world hello_world.o $(LIBRARY_DIRECTIVES) $(LINK_FLAGS)

hello_world.o: hello_world.cpp
	$(CXX) $(CXXFLAGS) -c -o hello_world.o hello_world.cpp $(INCLUDE_DIRECTIVES)

Important

LINK_FLAGS should be left empty if HPX_WITH_HPX_MAIN is set to OFF. Boost in the above example is built with --layout=tagged. The actual Boost library names may vary depending on your build of Boost.

To build the program, type:

$ make

A successful build should result in a hello_world binary. To test, type:

$ ./hello_world

How to build HPX components with makefile#

Let’s try a more complex example involving an HPX component. An HPX component is a class that exposes HPX actions. HPX components are compiled into dynamically-loaded modules called component libraries. Here’s the source code:

hello_world_component.cpp

#include <hpx/config.hpp>
#if !defined(HPX_COMPUTE_DEVICE_CODE)
#include <hpx/iostream.hpp>
#include "hello_world_component.hpp"

#include <iostream>

namespace examples { namespace server {
    void hello_world::invoke()
    {
        hpx::cout << "Hello HPX World!" << std::endl;
    }
}}    // namespace examples::server

HPX_REGISTER_COMPONENT_MODULE()

typedef hpx::components::component<examples::server::hello_world>
    hello_world_type;

HPX_REGISTER_COMPONENT(hello_world_type, hello_world)

HPX_REGISTER_ACTION(
    examples::server::hello_world::invoke_action, hello_world_invoke_action)
#endif

hello_world_component.hpp

#pragma once

#include <hpx/config.hpp>
#if !defined(HPX_COMPUTE_DEVICE_CODE)
#include <hpx/hpx.hpp>
#include <hpx/include/actions.hpp>
#include <hpx/include/components.hpp>
#include <hpx/include/lcos.hpp>
#include <hpx/serialization.hpp>

#include <utility>

namespace examples { namespace server {
    struct HPX_COMPONENT_EXPORT hello_world
      : hpx::components::component_base<hello_world>
    {
        void invoke();
        HPX_DEFINE_COMPONENT_ACTION(hello_world, invoke)
    };
}}    // namespace examples::server

HPX_REGISTER_ACTION_DECLARATION(
    examples::server::hello_world::invoke_action, hello_world_invoke_action)

namespace examples {
    struct hello_world
      : hpx::components::client_base<hello_world, server::hello_world>
    {
        typedef hpx::components::client_base<hello_world, server::hello_world>
            base_type;

        hello_world(hpx::future<hpx::id_type>&& f)
          : base_type(std::move(f))
        {
        }

        hello_world(hpx::id_type&& f)
          : base_type(std::move(f))
        {
        }

        void invoke()
        {
            hpx::async<server::hello_world::invoke_action>(this->get_id())
                .get();
        }
    };
}    // namespace examples

#endif

hello_world_client.cpp

#include <hpx/config.hpp>
#if defined(HPX_COMPUTE_HOST_CODE)
#include <hpx/wrap_main.hpp>

#include "hello_world_component.hpp"

int main()
{
    {
        // Create a single instance of the component on this locality.
        examples::hello_world client =
            hpx::new_<examples::hello_world>(hpx::find_here());

        // Invoke the component's action, which will print "Hello HPX World!".
        client.invoke();
    }

    return 0;
}
#endif

Now, in the directory, create a Makefile. Add the following code:

# CXX=/path/to/compiler  # Set your favourite compiler here, or leave this commented out to let make choose its default.

CXXFLAGS=-O3 -std=c++17

Boost_ROOT=/path/to/boost
Hwloc_ROOT=/path/to/hwloc
Tcmalloc_ROOT=/path/to/tcmalloc
HPX_ROOT=/path/to/hpx

INCLUDE_DIRECTIVES=-I$(HPX_ROOT)/include -I$(Boost_ROOT)/include -I$(Hwloc_ROOT)/include

LIBRARY_DIRECTIVES=-L$(HPX_ROOT)/lib $(HPX_ROOT)/lib/libhpx_init.a $(HPX_ROOT)/lib/libhpx.so $(Boost_ROOT)/lib/libboost_atomic-mt.so $(Boost_ROOT)/lib/libboost_filesystem-mt.so $(Boost_ROOT)/lib/libboost_program_options-mt.so $(Boost_ROOT)/lib/libboost_regex-mt.so $(Boost_ROOT)/lib/libboost_system-mt.so -lpthread $(Tcmalloc_ROOT)/libtcmalloc_minimal.so $(Hwloc_ROOT)/libhwloc.so -ldl -lrt

LINK_FLAGS=$(HPX_ROOT)/lib/libhpx_wrap.a -Wl,-wrap=main  # should be left empty for HPX_WITH_HPX_MAIN=OFF

# The client executable links against the component library.
hello_world_client: libhpx_hello_world.so hello_world_client.o
	$(CXX) $(CXXFLAGS) -o hello_world_client hello_world_client.o $(LIBRARY_DIRECTIVES) libhpx_hello_world.so $(LINK_FLAGS)

hello_world_client.o: hello_world_client.cpp
	$(CXX) $(CXXFLAGS) -c -o hello_world_client.o hello_world_client.cpp $(INCLUDE_DIRECTIVES)

# Components are dynamically loaded modules, so build a shared library.
libhpx_hello_world.so: hello_world_component.o
	$(CXX) $(CXXFLAGS) -shared -o libhpx_hello_world.so hello_world_component.o $(LIBRARY_DIRECTIVES)

hello_world_component.o: hello_world_component.cpp
	$(CXX) $(CXXFLAGS) -fPIC -c -o hello_world_component.o hello_world_component.cpp $(INCLUDE_DIRECTIVES)

To build the program, type:

$ make

A successful build should result in a hello_world_client binary. To test, type:

$ ./hello_world_client

Note

Because CMake flags and library dependencies vary widely between installations, we recommend building HPX applications and components with pkg-config or CMakeLists.txt. A hand-written Makefile can easily result in broken builds. The pkg-config files and CMake configuration files are generated by the CMake build of HPX itself, so they stay consistent with the installed libraries and provide better support overall.

Starting the HPX runtime#

In order to write an application that uses services from the HPX runtime system, you need to initialize the HPX library by inserting certain calls into the code of your application. Depending on your use case, this can be done in four different ways:

  • Minimally invasive: Re-use the main() function as the main HPX entry point.

  • Balanced use case: Supply your own main HPX entry point while blocking the main thread.

  • Most flexibility: Supply your own main HPX entry point while avoiding blocking the main thread.

  • Suspend and resume: As above but suspend and resume the HPX runtime to allow for other runtimes to be used.

Re-use the main() function as the main HPX entry point#

This method is the least intrusive to your code. However, it provides you with the least flexibility in terms of initializing the HPX runtime system. The following code snippet shows what a minimal HPX application using this technique looks like:

#include <hpx/hpx_main.hpp>

int main(int argc, char* argv[])
{
    return 0;
}

The only change to your code you have to make is to include the file hpx/hpx_main.hpp. In this case the function main() will be invoked as the first HPX thread of the application. The runtime system will be initialized behind the scenes before the function main() is executed and will automatically stop after main() has returned. For this method to work you must link your application to the CMake target HPX::wrap_main. This is done automatically if you are using the provided macros (Using macros to create new targets) to set up your application, but must be done explicitly if you are using targets directly (Using CMake targets). All HPX API functions can be used from within the main() function now. If you cannot or do not want to include hpx/hpx_main.hpp in main.cpp, you can instead link against HPX::auto_wrap_main. That target enables the same runtime startup path without needing the header-triggered opt-in.

Note

The use of HPX::auto_wrap_main is not supported when using the native Windows MSVC toolchain.

Note

The function main() does not need to expect receiving argc and argv as shown above, but could expose the signature int main(). This is consistent with the usually allowed prototypes for the function main() in C++ applications.

All command line arguments specific to HPX will still be processed by the HPX runtime system as usual. However, those command line options will be removed from the list of values passed to argc/argv of the function main(). The list of values passed to main() will hold only the command line options that are not recognized by the HPX runtime system (see the section HPX Command Line Options for more details on what options are recognized by HPX).
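The effect on argc/argv can be pictured with a small sketch. It is not how HPX parses its command line; it only assumes that runtime options are the ones starting with the prefix --hpx: (as in --hpx:ini), and the names split_args and split_command_line are hypothetical helpers introduced for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper types, not part of HPX.
struct split_args
{
    std::vector<std::string> runtime_options;     // consumed by the runtime
    std::vector<std::string> application_args;    // what main() would see
};

// Separates arguments that start with "--hpx:" from all other arguments.
// Assumption: the prefix alone identifies a runtime option.
inline split_args split_command_line(std::vector<std::string> const& args)
{
    static std::string const prefix = "--hpx:";
    split_args result;
    for (std::size_t i = 0; i != args.size(); ++i)
    {
        // The program name (first element) always stays with the application.
        bool const is_runtime_option =
            i != 0 && args[i].compare(0, prefix.size(), prefix) == 0;

        if (is_runtime_option)
            result.runtime_options.push_back(args[i]);
        else
            result.application_args.push_back(args[i]);
    }
    return result;
}
```

Called with the arguments app, --hpx:ini=hpx.max_idle_loop_count=0, and --input=data.txt, the sketch places the second argument in runtime_options and leaves the other two in application_args.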

Note

In this mode all one-letter shortcuts that are normally available on the HPX command line are disabled (such as -t or -l; see HPX Command Line Options). This is done to minimize any possible interaction between the command line options recognized by the HPX runtime system and any command line options defined by the application.

The value returned from the function main() as shown above will be returned to the operating system as usual.

Important

To achieve this seamless integration, the header file hpx/hpx_main.hpp defines a macro:

#define main hpx_startup::user_main

which could result in unexpected behavior.

Important

To achieve this seamless integration, we use different implementations for different operating systems. In case of Linux or macOS, the code present in hpx_wrap.cpp is put into action. We hook into the system function in case of Linux and provide alternate entry point in case of macOS. For other operating systems we rely on a macro:

#define main hpx_startup::user_main

provided in the header file hpx/hpx_main.hpp. This implementation can result in unexpected behavior.

Caution

We make use of an override variable include_libhpx_wrap in the header file hpx/hpx_main.hpp to choose the function call stack at runtime. Therefore, the header file should only be included in the main executable. Including it in components will result in multiple definitions of the variable.

Supply your own main HPX entry point while blocking the main thread#

With this method you need to provide an explicit main-thread function named hpx_main at global scope. This function will be invoked as the main entry point of your HPX application on the console locality only (this function will be invoked as the first HPX thread of your application). All HPX API functions can be used from within this function.

The thread executing the function hpx::init will block waiting for the runtime system to exit. The value returned from hpx_main will be returned from hpx::init after the runtime system has stopped.

The function hpx::finalize has to be called on one of the HPX localities in order to signal that all work has been scheduled and the runtime system should be stopped after the scheduled work has been executed.

This method of invoking HPX has the advantage that the user can decide which version of hpx::init to call. This allows you to pass additional configuration parameters while initializing the HPX runtime system.

#include <hpx/hpx_init.hpp>

int hpx_main(int argc, char* argv[])
{
    // Any HPX application logic goes here...
    return hpx::finalize();
}

int main(int argc, char* argv[])
{
    // Initialize HPX, run hpx_main as the first HPX thread, and
    // wait for hpx::finalize being called.
    return hpx::init(argc, argv);
}

Note

The function hpx_main does not need to expect receiving argc/argv as shown above, but could expose one of the following signatures:

int hpx_main();
int hpx_main(int argc, char* argv[]);
int hpx_main(hpx::program_options::variables_map& vm);

This is consistent with (and extends) the usually allowed prototypes for the function main() in C++ applications.

The header file to include for this method of using HPX is hpx/hpx_init.hpp.

There are many additional overloads of hpx::init available, such as the ability to provide your own entry-point function instead of hpx_main. Please refer to the function documentation for more details (see: hpx/hpx_init.hpp).

Supply your own main HPX entry point while avoiding blocking the main thread#

With this method you need to provide an explicit main thread function named hpx_main at global scope. This function will be invoked as the main entry point of your HPX application on the console locality only (this function will be invoked as the first HPX thread of your application). All HPX API functions can be used from within this function.

The thread executing the function hpx::start will not block waiting for the runtime system to exit, but will return immediately. The function hpx::finalize has to be called on one of the HPX localities in order to signal that all work has been scheduled and the runtime system should be stopped after the scheduled work has been executed.

This method of invoking HPX is useful for applications where the main thread is used for special operations, such as GUIs. The function hpx::stop can be used to wait for the HPX runtime system to exit and should at least be used as the last function called in main(). The value returned from hpx_main will be returned from hpx::stop after the runtime system has stopped.

#include <hpx/hpx_start.hpp>

int hpx_main(int argc, char* argv[])
{
    // Any HPX application logic goes here...
    return hpx::finalize();
}

int main(int argc, char* argv[])
{
    // Initialize HPX, run hpx_main.
    hpx::start(argc, argv);

    // ...Execute other code here...

    // Wait for hpx::finalize being called.
    return hpx::stop();
}

Note

The function hpx_main does not need to expect receiving argc/argv as shown above, but could expose one of the following signatures:

int hpx_main();
int hpx_main(int argc, char* argv[]);
int hpx_main(hpx::program_options::variables_map& vm);

This is consistent with (and extends) the usually allowed prototypes for the function main() in C++ applications.

The header file to include for this method of using HPX is hpx/hpx_start.hpp.

There are many additional overloads of hpx::start available, such as the option for users to provide their own entry point function instead of hpx_main. Please refer to the function documentation for more details (see: hpx/hpx_start.hpp).

Supply your own explicit startup function as the main HPX entry point#

There is also a way to specify any function (besides hpx_main) to be used as the main entry point for your HPX application:

#include <hpx/hpx_init.hpp>

int application_entry_point(int argc, char* argv[])
{
    // Any HPX application logic goes here...
    return hpx::finalize();
}

int main(int argc, char* argv[])
{
    // Initialize HPX, run application_entry_point as the first HPX thread,
    // and wait for hpx::finalize being called.
    return hpx::init(&application_entry_point, argc, argv);
}

Note

The function supplied to hpx::init must have one of the following prototypes:

int application_entry_point(int argc, char* argv[]);
int application_entry_point(hpx::program_options::variables_map& vm);

Note

If nullptr is used as the function argument, HPX will not run any startup function on this locality.

Suspending and resuming the HPX runtime#

In some applications it is required to combine HPX with other runtimes. To support this use case, HPX provides two functions: hpx::suspend and hpx::resume. hpx::suspend is a blocking call which will wait for all scheduled tasks to finish executing and then put the thread pool OS threads to sleep. hpx::resume simply wakes up the sleeping threads so that they are ready to accept new work. hpx::suspend and hpx::resume can be found in the header hpx/hpx_suspend.hpp.

#include <hpx/hpx_start.hpp>
#include <hpx/hpx_suspend.hpp>

int main(int argc, char* argv[])
{

    // Initialize HPX, don't run hpx_main
    hpx::start(nullptr, argc, argv);

    // Schedule a function on the HPX runtime
    hpx::post(&my_function, ...);

    // Wait for all tasks to finish, and suspend the HPX runtime
    hpx::suspend();

    // Execute non-HPX code here

    // Resume the HPX runtime
    hpx::resume();

    // Schedule more work on the HPX runtime

    // hpx::finalize has to be called from the HPX runtime before hpx::stop
    hpx::post([]() { hpx::finalize(); });
    return hpx::stop();
}

Note

hpx::suspend does not wait for hpx::finalize to be called. Only call hpx::finalize when you wish to fully stop the HPX runtime.

Warning

hpx::suspend only waits for local tasks, i.e. tasks on the current locality, to finish executing. When using hpx::suspend in a multi-locality scenario the user is responsible for ensuring that any work required from other localities has also finished.

HPX also supports suspending individual thread pools and threads. For details on how to do that, see the documentation for hpx::threads::thread_pool_base.

Automatically suspending worker threads#

The previous method guarantees that the worker threads are suspended when you ask for it and that they stay suspended. An alternative way to achieve the same effect is to tweak how quickly HPX suspends its worker threads when they run out of work. The following configuration values make sure that HPX idles very quickly:

hpx.max_idle_backoff_time = 1000
hpx.max_idle_loop_count = 0

They can be set on the command line using --hpx:ini=hpx.max_idle_backoff_time=1000 and --hpx:ini=hpx.max_idle_loop_count=0. See Launching and configuring HPX applications for more details on how to set configuration parameters.

After setting idling parameters the previous example could now be written like this instead:

#include <hpx/hpx_start.hpp>

int main(int argc, char* argv[])
{

    // Initialize HPX, don't run hpx_main
    hpx::start(nullptr, argc, argv);

    // Schedule some functions on the HPX runtime
    // NOTE: run_as_hpx_thread blocks until completion.
    hpx::run_as_hpx_thread(&my_function, ...);
    hpx::run_as_hpx_thread(&my_other_function, ...);

    // hpx::finalize has to be called from the HPX runtime before hpx::stop
    hpx::post([]() { hpx::finalize(); });
    return hpx::stop();
}

In this example each call to hpx::run_as_hpx_thread acts as a “parallel region”.

Working of hpx_main.hpp#

In order to initialize HPX from main(), we make use of linker tricks.

It is implemented differently for different operating systems. The method of implementation is as follows:

  • Linux: Using linker --wrap option.

  • Mac OSX: Using the linker -e option.

  • Windows: Using #define main hpx_startup::user_main

Linux implementation#

We make use of the Linux linker ld's --wrap option to wrap the main() function. This way any calls to main() are redirected to our own implementation of main. It is here that we check for the existence of hpx_main.hpp by making use of a shadow variable include_libhpx_wrap. The value of this variable determines the function stack at runtime.

The implementation can be found in libhpx_wrap.a.

Important

It is necessary that hpx_main.hpp not be included more than once. Multiple inclusions can result in multiple definitions of include_libhpx_wrap.

Mac OSX implementation#

Here we make use of yet another linker option -e to change the entry point to our custom entry function initialize_main. We initialize the HPX runtime system from this function and call main from the initialized system. We determine the function stack at runtime by making use of the shadow variable include_libhpx_wrap.

The implementation can be found in libhpx_wrap.a.

Important

It is necessary that hpx_main.hpp not be included more than once. Multiple inclusions can result in multiple definitions of include_libhpx_wrap.

Windows implementation#

We make use of a macro #define main hpx_startup::user_main to take care of the initializations.

This implementation could result in unexpected behaviors.

Launching and configuring HPX applications#

Configuring HPX applications#

All HPX applications can be configured using special command line options and/or using special configuration files. This section describes the available options, the configuration file format, and the algorithm used to locate possible predefined configuration files. Additionally, this section describes the defaults assumed if no external configuration information is supplied.

During startup any HPX application applies a predefined search pattern to locate one or more configuration files. All found files will be read and merged in the sequence they are found into one single internal database holding all configuration properties. This database is used during the execution of the application to configure different aspects of the runtime system.

In addition to the ini files, any application can supply its own configuration files, which will be merged with the configuration database as well. Moreover, the user can specify additional configuration parameters on the command line when executing an application. The HPX runtime system will merge all command line configuration options (see the description of the --hpx:ini, --hpx:config, and --hpx:app-config command line options).
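As a rough illustration of that merge step, the sketch below folds --hpx:ini=<key>=<value> arguments into a key/value map. It is not taken from the HPX sources: config_map and merge_ini_options are hypothetical names, and the rule that a later setting replaces an earlier one is an assumption made for this example.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using config_map = std::map<std::string, std::string>;

// Hypothetical helper, not HPX code: merges every
// "--hpx:ini=<key>=<value>" argument into 'config'.
// Assumption: a later setting replaces an earlier one with the same key.
inline void merge_ini_options(
    config_map& config, std::vector<std::string> const& args)
{
    static std::string const prefix = "--hpx:ini=";
    for (std::string const& arg : args)
    {
        if (arg.compare(0, prefix.size(), prefix) != 0)
            continue;    // not an ini setting

        // Only the first '=' after the prefix separates key and value.
        std::string const setting = arg.substr(prefix.size());
        std::size_t const eq = setting.find('=');
        if (eq == std::string::npos || eq == 0)
            continue;    // malformed: missing key or missing '='

        config[setting.substr(0, eq)] = setting.substr(eq + 1);
    }
}
```

With a map that already holds hpx.max_idle_loop_count, passing --hpx:ini=hpx.max_idle_loop_count=0 replaces the stored value, while arguments without the --hpx:ini= prefix are left untouched.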

The HPX ini file format#

All HPX applications can be configured using a special file format that is similar to the well-known Windows INI file format. This is a structured text format that allows users to group key/value pairs (properties) into sections. The basic element contained in an ini file is the property. Every property has a name and a value, delimited by an equal sign '='. The name appears to the left of the equal sign:

name=value

The value may contain equal signs as only the first '=' character is interpreted as the delimiter between name and value. Whitespace before the name, after the value and immediately before and after the delimiting equal sign is ignored. Whitespace inside the value is retained.

Properties may be grouped into arbitrarily named sections. The section name appears on a line by itself, in square brackets. All properties after the section declaration are associated with that section. There is no explicit “end of section” delimiter; sections end at the next section declaration or the end of the file:

[section]

In HPX sections can be nested. A nested section has a name composed of all section names it is embedded in. The section names are concatenated using a dot '.':

[outer_section.inner_section]

Here, inner_section is logically nested within outer_section.

It is possible to use the full section name concatenated with the property name to refer to a particular property. For example, in:

[a.b.c]
d = e

the property value of d can be referred to as a.b.c.d=e.

In HPX, ini files can contain comments. Hash signs '#' at the beginning of a line indicate a comment. All characters starting with '#' until the end of the line are ignored.

If a property with the same name is reused inside a section, the second occurrence of this property name will override the first occurrence (discard the first value). Duplicate sections simply merge their properties together, as if they occurred contiguously.
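The rules above can be condensed into a minimal parser sketch. It is not the HPX implementation: parse_ini and trim are hypothetical helpers, the result is stored as a flat map keyed by the full dotted property name, only whole-line comments are recognized, and lines that are neither a section nor a property are silently skipped.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Remove leading and trailing whitespace.
inline std::string trim(std::string const& s)
{
    char const* const ws = " \t\r\n";
    std::size_t const first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return std::string();
    std::size_t const last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parse ini text into a flat map keyed by the full property name
// ("section.subsection.name"). A later occurrence of the same name
// overrides the earlier one; duplicate sections merge naturally.
inline std::map<std::string, std::string> parse_ini(std::string const& text)
{
    std::map<std::string, std::string> properties;
    std::string section;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line))
    {
        line = trim(line);
        if (line.empty() || line[0] == '#')
            continue;    // blank line or comment

        if (line.front() == '[' && line.back() == ']')
        {
            // A section stays active until the next section declaration.
            section = trim(line.substr(1, line.size() - 2));
            continue;
        }

        // Only the first '=' separates name and value.
        std::size_t const eq = line.find('=');
        if (eq == std::string::npos)
            continue;    // not a property; ignored in this sketch

        std::string const name = trim(line.substr(0, eq));
        std::string const value = trim(line.substr(eq + 1));
        properties[section.empty() ? name : section + "." + name] = value;
    }
    return properties;
}
```

Parsing a section [a.b.c] that contains d = e followed by d = f=g yields a single entry a.b.c.d with the value f=g, since the second occurrence overrides the first and only the first '=' acts as delimiter.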

In HPX ini files, a property value ${FOO:default} will use the environment variable FOO to extract the actual value if it is set and default otherwise. No default has to be specified. Therefore, ${FOO} refers to the environment variable FOO. If FOO is not set or empty, the overall expression will evaluate to an empty string. A property value $[section.key:default] refers to the value held by the property section.key if it exists and default otherwise. No default has to be specified. Therefore, $[section.key] refers to the property section.key. If the property section.key is not set or empty, the overall expression will evaluate to an empty string.

Note

Any property $[section.key:default] is evaluated whenever it is queried and not when the configuration data is initialized. This allows for lazy evaluation and relaxes the initialization order of different sections. The only exceptions are recursive property values, e.g., values referring to the very key they are associated with. Those property values are evaluated at initialization time to avoid infinite recursion.
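A minimal sketch of such query-time expansion is shown below. It is an illustration under stated assumptions rather than the HPX implementation: expand and property_map are hypothetical names, an environment variable that is set but empty is treated like an unset one, and recursive references are merely cut off by a depth limit instead of being resolved at initialization time.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

using property_map = std::map<std::string, std::string>;

// Expands "${ENV:default}" and "$[section.key:default]" inside 'value'.
// Hypothetical helper; the depth limit only guards against endless
// recursion and is not how HPX treats recursive property values.
inline std::string expand(
    std::string const& value, property_map const& properties, int depth = 0)
{
    if (depth > 16)
        return value;

    std::string out;
    std::size_t i = 0;
    while (i < value.size())
    {
        bool const is_reference = value[i] == '$' && i + 1 < value.size() &&
            (value[i + 1] == '{' || value[i + 1] == '[');
        if (!is_reference)
        {
            out += value[i++];
            continue;
        }

        char const open = value[i + 1];
        char const close = open == '{' ? '}' : ']';

        // Find the matching close, skipping nested references of the same kind.
        std::size_t end = i + 2;
        int nesting = 1;
        for (; end < value.size(); ++end)
        {
            if (value[end] == open && value[end - 1] == '$')
                ++nesting;
            else if (value[end] == close && --nesting == 0)
                break;
        }
        if (end == value.size())
        {
            out += value.substr(i);    // unterminated reference: keep as is
            break;
        }

        std::string const body = value.substr(i + 2, end - (i + 2));
        std::size_t const colon = body.find(':');
        std::string const name = body.substr(0, colon);
        std::string const fallback = colon == std::string::npos ?
            std::string() : body.substr(colon + 1);

        std::string resolved;
        if (open == '{')
        {
            // Environment variable lookup.
            if (char const* env = std::getenv(name.c_str()))
                resolved = env;
        }
        else
        {
            // Property lookup, expanded only now, when it is queried.
            auto const it = properties.find(name);
            if (it != properties.end())
                resolved = expand(it->second, properties, depth + 1);
        }

        out += resolved.empty() ? expand(fallback, properties, depth + 1) :
                                  resolved;
        i = end + 1;
    }
    return out;
}
```

Because the lookup happens inside expand, a property such as hpx.location may refer to system.prefix regardless of the order in which the two were stored in the map.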

Built-in default configuration settings#

During startup any HPX application applies a predefined search pattern to locate one or more configuration files. All found files will be read and merged in the sequence they are found into one single internal data structure holding all configuration properties.

As a first step the internal configuration database is filled with a set of default configuration properties. Those settings are described on a section by section basis below.

Note

You can print the default configuration settings used for an executable by specifying the command line option --hpx:dump-config.

The system configuration section#
[system]
pid = <process-id>
prefix = <current prefix path of core HPX library>
executable_prefix = <current prefix path of executable>

Property

Description

system.pid

This is initialized to store the current OS-process id of the application instance.

system.prefix

This is initialized to the base directory HPX has been loaded from.

system.executable_prefix

This is initialized to the base directory the current executable has been loaded from.

The HPX configuration section#
[hpx]
location = ${HPX_LOCATION:$[system.prefix]}
component_path = $[hpx.location]/lib/hpx:$[system.executable_prefix]/lib/hpx:$[system.executable_prefix]/../lib/hpx
master_ini_path = $[hpx.location]/share/hpx-<version>:$[system.executable_prefix]/share/hpx-<version>:$[system.executable_prefix]/../share/hpx-<version>
ini_path = $[hpx.master_ini_path]/ini
os_threads = 1
cores = all
localities = 1
program_name =
cmd_line =
lock_detection = ${HPX_LOCK_DETECTION:0}
throw_on_held_lock = ${HPX_THROW_ON_HELD_LOCK:1}
minimal_deadlock_detection = <debug>
spinlock_deadlock_detection = <debug>
spinlock_deadlock_detection_limit = ${HPX_SPINLOCK_DEADLOCK_DETECTION_LIMIT:1000000}
max_background_threads = ${HPX_MAX_BACKGROUND_THREADS:$[hpx.os_threads]}
max_idle_loop_count = ${HPX_MAX_IDLE_LOOP_COUNT:<hpx_idle_loop_count_max>}
max_busy_loop_count = ${HPX_MAX_BUSY_LOOP_COUNT:<hpx_busy_loop_count_max>}
max_idle_backoff_time = ${HPX_MAX_IDLE_BACKOFF_TIME:<hpx_idle_backoff_time_max>}
exception_verbosity = ${HPX_EXCEPTION_VERBOSITY:2}
trace_depth = ${HPX_TRACE_DEPTH:20}
handle_signals = ${HPX_HANDLE_SIGNALS:1}
handle_failed_new = ${HPX_HANDLE_FAILED_NEW:1}

[hpx.stacks]
small_size = ${HPX_SMALL_STACK_SIZE:<hpx_small_stack_size>}
medium_size = ${HPX_MEDIUM_STACK_SIZE:<hpx_medium_stack_size>}
large_size = ${HPX_LARGE_STACK_SIZE:<hpx_large_stack_size>}
huge_size = ${HPX_HUGE_STACK_SIZE:<hpx_huge_stack_size>}
use_guard_pages = ${HPX_THREAD_GUARD_PAGE:1}

Property

Description

hpx.location

This is initialized to the location of the HPX installation. It defaults to the value of system.prefix unless the environment variable HPX_LOCATION is set.

hpx.component_path

This is initialized to the list of directories in which HPX looks for component shared libraries. Duplicates are discarded. This property can refer to a list of directories separated by ':' (Linux, Android, and MacOS) or by ';' (Windows).

hpx.master_ini_path

This is initialized to the list of default paths of the main hpx.ini configuration files. This property can refer to a list of directories separated by ':' (Linux, Android, and MacOS) or by ';' (Windows).

hpx.ini_path

This is initialized to the default path where HPX will look for more ini configuration files. This property can refer to a list of directories separated by ':' (Linux, Android, and MacOS) or by ';' (Windows).

hpx.os_threads

This setting reflects the number of OS threads used for running HPX threads. Defaults to number of detected cores (not hyperthreads/PUs).

hpx.cores

This setting reflects the number of cores used for running HPX threads. Defaults to number of detected cores (not hyperthreads/PUs).

hpx.localities

This setting reflects the number of localities the application is running on. Defaults to 1.

hpx.program_name

This setting reflects the program name of the application instance. Initialized from the command line argv[0].

hpx.cmd_line

This setting reflects the actual command line used to launch this application instance.

hpx.lock_detection

This setting verifies that no locks are being held while an HPX thread is suspended. This setting is applicable only if HPX_WITH_VERIFY_LOCKS is set during configuration in CMake.

hpx.throw_on_held_lock

This setting causes an exception to be thrown if, during lock detection, at least one lock is being held while an HPX thread is suspended. This setting is applicable only if HPX_WITH_VERIFY_LOCKS is set during configuration in CMake. This setting has no effect if hpx.lock_detection=0.

hpx.minimal_deadlock_detection

This setting enables support for minimal deadlock detection for HPX threads. By default this is set to 1 (for Debug builds) or to 0 (for Release, RelWithDebInfo, MinSizeRel builds). This setting is effective only if HPX_WITH_THREAD_DEADLOCK_DETECTION is set during configuration in CMake.

hpx.spinlock_deadlock_detection

This setting verifies that spinlocks don’t spin longer than specified using the hpx.spinlock_deadlock_detection_limit. This setting is applicable only if HPX_WITH_SPINLOCK_DEADLOCK_DETECTION is set during configuration in CMake. By default this is set to 1 (for Debug builds) or to 0 (for Release, RelWithDebInfo, MinSizeRel builds).

hpx.spinlock_deadlock_detection_limit

This setting specifies the maximum number of spins that a spinlock is allowed to perform. This setting is applicable only if HPX_WITH_SPINLOCK_DEADLOCK_DETECTION is set during configuration in CMake. By default this is set to 1000000.

hpx.max_background_threads

This setting defines the number of threads in the scheduler, which are used to execute background work. By default this is the same as the number of cores used for the scheduler.

hpx.max_idle_loop_count

This setting defines the maximum value of the idle-loop counter in the scheduler. By default this is defined by the preprocessor constant HPX_IDLE_LOOP_COUNT_MAX. This is an internal setting that you should change only if you know exactly what you are doing.

hpx.max_busy_loop_count

This setting defines the maximum value of the busy-loop counter in the scheduler. By default this is defined by the preprocessor constant HPX_BUSY_LOOP_COUNT_MAX. This is an internal setting that you should change only if you know exactly what you are doing.

hpx.max_idle_backoff_time

This setting defines the maximum time (in milliseconds) for the scheduler to sleep after being idle for hpx.max_idle_loop_count iterations. This setting is applicable only if HPX_WITH_THREAD_MANAGER_IDLE_BACKOFF is set during configuration in CMake. By default this is defined by the preprocessor constant HPX_IDLE_BACKOFF_TIME_MAX. This is an internal setting that you should change only if you know exactly what you are doing.

hpx.exception_verbosity

This setting defines the verbosity of exceptions. Valid values are integers. A setting of 2 or higher prints all available information. A setting of 1 leaves out the build configuration and environment variables. A setting of 0 or lower prints only the description of the thrown exception and the file name, function, and line number where the exception was thrown. The default value is 2 or the value of the environment variable HPX_EXCEPTION_VERBOSITY.

hpx.trace_depth

This setting defines the number of stack levels printed in generated stack backtraces. This defaults to 20, but can be changed using the CMake configuration setting HPX_WITH_THREAD_BACKTRACE_DEPTH.

hpx.handle_signals

This setting defines whether HPX will register signal handlers that will print the configuration information (stack backtrace, system information, etc.) whenever a signal is raised. The default is 1. Setting this value to 0 can be useful in cases when generating a core-dump on segmentation faults or similar signals is desired.

hpx.handle_failed_new

This setting defines whether HPX will register a handler for failed allocations that will print the configuration information (stack backtrace, system information, etc.) whenever an allocation fails. The default is 1. Setting this value to 0 can be useful in cases when generating a core-dump on segmentation faults or similar signals is desired.

hpx.stacks.small_size

This is initialized to the small stack size to be used by HPX threads. Set by default to the value of the compile time preprocessor constant HPX_SMALL_STACK_SIZE (defaults to 0x8000). This value is used for all HPX threads by default, except for the thread running hpx_main (which runs on a large stack).

hpx.stacks.medium_size

This is initialized to the medium stack size to be used by HPX threads. Set by default to the value of the compile time preprocessor constant HPX_MEDIUM_STACK_SIZE (defaults to 0x20000).

hpx.stacks.large_size

This is initialized to the large stack size to be used by HPX threads. Set by default to the value of the compile time preprocessor constant HPX_LARGE_STACK_SIZE (defaults to 0x200000). This setting is used by default for the thread running hpx_main only.

hpx.stacks.huge_size

This is initialized to the huge stack size to be used by HPX threads. Set by default to the value of the compile time preprocessor constant HPX_HUGE_STACK_SIZE (defaults to 0x2000000).

hpx.stacks.use_guard_pages

This entry controls whether the coroutine library will generate stack guard pages or not. This entry is applicable on Linux only and only if the HPX_USE_GENERIC_COROUTINE_CONTEXT option is not enabled and the HPX_WITH_THREAD_GUARD_PAGE is set to 1 while configuring the build system. It is set by default to 1.
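
As an illustration, a user-supplied ini file could override a few of the properties listed above. The following is only a sketch: the concrete values (4 OS threads, a 0x10000 byte small stack) are arbitrary assumptions chosen for the example, not recommendations, and the comment lines assume that '#' introduces a comment in the ini format.

```ini
[hpx]
# Illustrative values only
os_threads = 4
exception_verbosity = 1
# Let segmentation faults produce a core dump instead of an HPX report
handle_signals = 0

[hpx.stacks]
# Twice the documented default of 0x8000
small_size = 0x10000
```

Properties that are not mentioned in such a file keep the built-in defaults shown in the listing above.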

The hpx.threadpools configuration section#
[hpx.threadpools]
io_pool_size = ${HPX_NUM_IO_POOL_SIZE:2}
parcel_pool_size = ${HPX_NUM_PARCEL_POOL_SIZE:2}
timer_pool_size = ${HPX_NUM_TIMER_POOL_SIZE:2}

Property

Description

hpx.threadpools.io_pool_size

The value of this property defines the number of OS threads created for the internal I/O thread pool.

hpx.threadpools.parcel_pool_size

The value of this property defines the number of OS threads created for the internal parcel thread pool.

hpx.threadpools.timer_pool_size

The value of this property defines the number of OS threads created for the internal timer thread pool.

The hpx.thread_queue configuration section#

Important

These settings control internal values used by the thread scheduling queues in the HPX scheduler. You should not modify these settings unless you know exactly what you are doing.

[hpx.thread_queue]
min_tasks_to_steal_pending = ${HPX_THREAD_QUEUE_MIN_TASKS_TO_STEAL_PENDING:0}
min_tasks_to_steal_staged = ${HPX_THREAD_QUEUE_MIN_TASKS_TO_STEAL_STAGED:0}
min_add_new_count = ${HPX_THREAD_QUEUE_MIN_ADD_NEW_COUNT:10}
max_add_new_count = ${HPX_THREAD_QUEUE_MAX_ADD_NEW_COUNT:10}
max_delete_count = ${HPX_THREAD_QUEUE_MAX_DELETE_COUNT:1000}

Property

Description

hpx.thread_queue.min_tasks_to_steal_pending

The value of this property defines the number of pending HPX threads that have to be available before neighboring cores are allowed to steal work. The default is to allow stealing always.

hpx.thread_queue.min_tasks_to_steal_staged

The value of this property defines the number of staged HPX tasks that need to be available before neighboring cores are allowed to steal work. The default is to allow stealing always.

hpx.thread_queue.min_add_new_count

The value of this property defines the minimal number of tasks to be converted into HPX threads whenever the thread queues for a core have run empty.

hpx.thread_queue.max_add_new_count

The value of this property defines the maximal number of tasks to be converted into HPX threads whenever the thread queues for a core have run empty.

hpx.thread_queue.max_delete_count

The value of this property defines the number of terminated HPX threads to discard during each invocation of the corresponding function.

The hpx.components configuration section#
[hpx.components]
load_external = ${HPX_LOAD_EXTERNAL_COMPONENTS:1}

Property

Description

hpx.components.load_external

This entry defines whether external components will be loaded on this locality. This entry is normally set to 1, and usually there is no need to directly change this value. It is automatically set to 0 for a dedicated AGAS server locality.

Additionally, the section hpx.components will be populated with the information gathered from all found components. The information loaded for each of the components will contain at least the following properties:

[hpx.components.<component_instance_name>]
name = <component_name>
path = <full_path_of_the_component_module>
enabled = $[hpx.components.load_external]

Property

Description

hpx.components.<component_instance_name>.name

This is the name of a component, usually the same as the second argument to the macro used while registering the component with HPX_REGISTER_COMPONENT. Set by the component factory.

hpx.components.<component_instance_name>.path

This is either the full path file name of the component module or the directory the component module is located in. In the latter case, the component module name will be derived from the property hpx.components.<component_instance_name>.name. Set by the component factory.

hpx.components.<component_instance_name>.enabled

This setting explicitly enables or disables the component. This is an optional property. HPX assumes that the component is enabled if it is not defined.

The value for <component_instance_name> is usually the same as for the corresponding name property. However, generally it can be defined to any arbitrary instance name. It is used to distinguish between different ini sections, one for each component.
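
A filled-in component section might look like the following sketch. The instance name my_component is hypothetical and used only for this example; the path is given as a directory, so the module name would be derived from the name property as described above.

```ini
[hpx.components.my_component]
# 'my_component' is a made-up name for illustration
name = my_component
path = $[hpx.location]/lib/hpx
enabled = 1
```

Setting enabled to 0 in such a section would disable this one component without affecting hpx.components.load_external.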

The hpx.parcel configuration section#
[hpx.parcel]
address = ${HPX_PARCEL_SERVER_ADDRESS:<hpx_initial_ip_address>}
port = ${HPX_PARCEL_SERVER_PORT:<hpx_initial_ip_port>}
bootstrap = ${HPX_PARCEL_BOOTSTRAP:<hpx_parcel_bootstrap>}
max_connections = ${HPX_PARCEL_MAX_CONNECTIONS:<hpx_parcel_max_connections>}
max_connections_per_locality = ${HPX_PARCEL_MAX_CONNECTIONS_PER_LOCALITY:<hpx_parcel_max_connections_per_locality>}
max_message_size = ${HPX_PARCEL_MAX_MESSAGE_SIZE:<hpx_parcel_max_message_size>}
max_outbound_message_size = ${HPX_PARCEL_MAX_OUTBOUND_MESSAGE_SIZE:<hpx_parcel_max_outbound_message_size>}
array_optimization = ${HPX_PARCEL_ARRAY_OPTIMIZATION:1}
zero_copy_optimization = ${HPX_PARCEL_ZERO_COPY_OPTIMIZATION:$[hpx.parcel.array_optimization]}
zero_copy_receive_optimization = ${HPX_PARCEL_ZERO_COPY_RECEIVE_OPTIMIZATION:$[hpx.parcel.array_optimization]}
async_serialization = ${HPX_PARCEL_ASYNC_SERIALIZATION:1}
message_handlers = ${HPX_PARCEL_MESSAGE_HANDLERS:0}

Property

Description

hpx.parcel.address

This property defines the default IP address to be used for the parcel layer to listen to. This IP address will be used as long as no other values are specified (for instance, using the --hpx:hpx command line option). The expected format is any valid IP address or domain name format that can be resolved into an IP address. The default depends on the compile time preprocessor constant HPX_INITIAL_IP_ADDRESS ("127.0.0.1").

hpx.parcel.port

This property defines the default IP port to be used for the parcel layer to listen to. This IP port will be used as long as no other values are specified (for instance using the --hpx:hpx command line option). The default depends on the compile time preprocessor constant HPX_INITIAL_IP_PORT (7910).

hpx.parcel.bootstrap

This property defines which parcelport type should be used during application bootstrap. The default depends on the compile time preprocessor constant HPX_PARCEL_BOOTSTRAP ("tcp").

hpx.parcel.max_connections

This property defines how many network connections between different localities are overall kept alive by each locality. The default depends on the compile time preprocessor constant HPX_PARCEL_MAX_CONNECTIONS (512).

hpx.parcel.max_connections_per_locality

This property defines the maximum number of network connections that one locality will open to another locality. The default depends on the compile time preprocessor constant HPX_PARCEL_MAX_CONNECTIONS_PER_LOCALITY (4).

hpx.parcel.max_message_size

This property defines the maximum allowed message size that will be transferrable through the parcel layer. The default depends on the compile time preprocessor constant HPX_PARCEL_MAX_MESSAGE_SIZE (1000000000 bytes).

hpx.parcel.max_outbound_message_size

This property defines the maximum allowed outbound coalesced message size that will be transferrable through the parcel layer. The default depends on the compile time preprocessor constant HPX_PARCEL_MAX_OUTBOUND_MESSAGE_SIZE (1000000 bytes).

hpx.parcel.array_optimization

This property defines whether this locality is allowed to utilize array optimizations during serialization of parcel data. The default is 1.

hpx.parcel.zero_copy_optimization

This property defines whether this locality is allowed to utilize zero copy optimizations during serialization of parcel data. The default is the same value as set for hpx.parcel.array_optimization.

hpx.parcel.zero_copy_receive_optimization

This property defines whether this locality is allowed to utilize zero copy optimizations on the receiving end during de-serialization of parcel data. The default is the same value as set for hpx.parcel.array_optimization.

hpx.parcel.zero_copy_serialization_threshold

This property defines the threshold value (in bytes) starting at which the serialization layer will apply zero-copy optimizations for serialized entities. The default value is defined by the preprocessor constant HPX_ZERO_COPY_SERIALIZATION_THRESHOLD.

hpx.parcel.async_serialization

This property defines whether this locality is allowed to spawn a new thread for serialization (this is both for encoding and decoding parcels). The default is 1.

hpx.parcel.message_handlers

This property defines whether message handlers are loaded. The default is 0.

hpx.parcel.max_background_threads

This property defines how many cores should be used to perform background operations. The default is -1 (all cores).

The following settings relate to the TCP/IP parcelport.

[hpx.parcel.tcp]
enable = ${HPX_HAVE_PARCELPORT_TCP:$[hpx.parcel.enabled]}
array_optimization = ${HPX_PARCEL_TCP_ARRAY_OPTIMIZATION:$[hpx.parcel.array_optimization]}
zero_copy_optimization = ${HPX_PARCEL_TCP_ZERO_COPY_OPTIMIZATION:$[hpx.parcel.zero_copy_optimization]}
zero_copy_receive_optimization = ${HPX_PARCEL_TCP_ZERO_COPY_RECEIVE_OPTIMIZATION:$[hpx.parcel.zero_copy_receive_optimization]}
zero_copy_serialization_threshold =  ${HPX_PARCEL_TCP_ZERO_COPY_SERIALIZATION_THRESHOLD:$[hpx.parcel.zero_copy_serialization_threshold]}
async_serialization = ${HPX_PARCEL_TCP_ASYNC_SERIALIZATION:$[hpx.parcel.async_serialization]}
parcel_pool_size = ${HPX_PARCEL_TCP_PARCEL_POOL_SIZE:$[hpx.threadpools.parcel_pool_size]}
max_connections =  ${HPX_PARCEL_TCP_MAX_CONNECTIONS:$[hpx.parcel.max_connections]}
max_connections_per_locality = ${HPX_PARCEL_TCP_MAX_CONNECTIONS_PER_LOCALITY:$[hpx.parcel.max_connections_per_locality]}
max_message_size =  ${HPX_PARCEL_TCP_MAX_MESSAGE_SIZE:$[hpx.parcel.max_message_size]}
max_outbound_message_size =  ${HPX_PARCEL_TCP_MAX_OUTBOUND_MESSAGE_SIZE:$[hpx.parcel.max_outbound_message_size]}
max_background_threads =  ${HPX_PARCEL_TCP_MAX_BACKGROUND_THREADS:$[hpx.parcel.max_background_threads]}

Property

Description

hpx.parcel.tcp.enable

Enables the use of the default TCP parcelport. Note that the initial bootstrap of the overall HPX application will be performed using the default TCP connections. This parcelport is enabled by default. This will be disabled only if MPI is enabled (see below).

hpx.parcel.tcp.array_optimization

This property defines whether this locality is allowed to utilize array optimizations in the TCP/IP parcelport during serialization of parcel data. The default is the same value as set for hpx.parcel.array_optimization.

hpx.parcel.tcp.zero_copy_optimization

This property defines whether this locality is allowed to utilize zero copy optimizations during serialization of parcel data. The default is the same value as set for hpx.parcel.zero_copy_optimization.

hpx.parcel.tcp.zero_copy_receive_optimization

This property defines whether this locality is allowed to utilize zero copy optimizations on the receiving end in the TCP/IP parcelport during de-serialization of parcel data. The default is the same value as set for hpx.parcel.zero_copy_receive_optimization.

hpx.parcel.tcp.zero_copy_serialization_threshold

This property defines the threshold value (in bytes) starting at which the serialization layer will apply zero-copy optimizations for serialized entities. The default is the same value as set for hpx.parcel.zero_copy_serialization_threshold.

hpx.parcel.tcp.async_serialization

This property defines whether this locality is allowed to spawn a new thread for serialization in the TCP/IP parcelport (this is both for encoding and decoding parcels). The default is the same value as set for hpx.parcel.async_serialization.

hpx.parcel.tcp.parcel_pool_size

The value of this property defines the number of OS threads created for the internal parcel thread pool of the TCP parcel port. The default is taken from hpx.threadpools.parcel_pool_size.

hpx.parcel.tcp.max_connections

This property defines how many network connections between different localities are overall kept alive by each locality. The default is taken from hpx.parcel.max_connections.

hpx.parcel.tcp.max_connections_per_locality

This property defines the maximum number of network connections that one locality will open to another locality. The default is taken from hpx.parcel.max_connections_per_locality.

hpx.parcel.tcp.max_message_size

This property defines the maximum allowed message size that will be transferrable through the parcel layer. The default is taken from hpx.parcel.max_message_size.

hpx.parcel.tcp.max_outbound_message_size

This property defines the maximum allowed outbound coalesced message size that will be transferrable through the parcel layer. The default is taken from hpx.parcel.max_outbound_message_size.

hpx.parcel.tcp.max_background_threads

This property defines how many cores should be used to perform background operations. The default is taken from hpx.parcel.max_background_threads.

The following settings relate to the MPI parcelport. These settings take effect only if the compile time constant HPX_HAVE_PARCELPORT_MPI is set (the equivalent CMake variable is HPX_WITH_PARCELPORT_MPI and has to be set to ON).

[hpx.parcel.mpi]
enable = ${HPX_HAVE_PARCELPORT_MPI:$[hpx.parcel.enabled]}
env = ${HPX_HAVE_PARCELPORT_MPI_ENV:MV2_COMM_WORLD_RANK,PMI_RANK,OMPI_COMM_WORLD_SIZE,ALPS_APP_PE,PALS_NODEID}
multithreaded = ${HPX_HAVE_PARCELPORT_MPI_MULTITHREADED:1}
rank = <MPI_rank>
processor_name = <MPI_processor_name>
array_optimization = ${HPX_HAVE_PARCEL_MPI_ARRAY_OPTIMIZATION:$[hpx.parcel.array_optimization]}
zero_copy_optimization = ${HPX_HAVE_PARCEL_MPI_ZERO_COPY_OPTIMIZATION:$[hpx.parcel.zero_copy_optimization]}
zero_copy_receive_optimization = ${HPX_HAVE_PARCEL_MPI_ZERO_COPY_RECEIVE_OPTIMIZATION:$[hpx.parcel.zero_copy_receive_optimization]}
zero_copy_serialization_threshold =  ${HPX_PARCEL_MPI_ZERO_COPY_SERIALIZATION_THRESHOLD:$[hpx.parcel.zero_copy_serialization_threshold]}
use_io_pool = ${HPX_HAVE_PARCEL_MPI_USE_IO_POOL:1}
async_serialization = ${HPX_HAVE_PARCEL_MPI_ASYNC_SERIALIZATION:$[hpx.parcel.async_serialization]}
parcel_pool_size = ${HPX_HAVE_PARCEL_MPI_PARCEL_POOL_SIZE:$[hpx.threadpools.parcel_pool_size]}
max_connections =  ${HPX_HAVE_PARCEL_MPI_MAX_CONNECTIONS:$[hpx.parcel.max_connections]}
max_connections_per_locality = ${HPX_HAVE_PARCEL_MPI_MAX_CONNECTIONS_PER_LOCALITY:$[hpx.parcel.max_connections_per_locality]}
max_message_size =  ${HPX_HAVE_PARCEL_MPI_MAX_MESSAGE_SIZE:$[hpx.parcel.max_message_size]}
max_outbound_message_size =  ${HPX_HAVE_PARCEL_MPI_MAX_OUTBOUND_MESSAGE_SIZE:$[hpx.parcel.max_outbound_message_size]}
max_background_threads =  ${HPX_PARCEL_MPI_MAX_BACKGROUND_THREADS:$[hpx.parcel.max_background_threads]}

Property

Description

hpx.parcel.mpi.enable

Enables the use of the MPI parcelport. HPX tries to detect if the application was started within a parallel MPI environment. If the detection was successful, the MPI parcelport is enabled by default. To explicitly disable the MPI parcelport, set to 0. Note that the initial bootstrap of the overall HPX application will be performed using MPI as well.

hpx.parcel.mpi.env

This property influences which environment variables (separated by commas) will be analyzed to find out whether the application was invoked by MPI.

hpx.parcel.mpi.multithreaded

This property is used to determine what threading mode to use when initializing MPI. If this setting is 0, HPX will initialize MPI with MPI_THREAD_SINGLE. If the value is not equal to 0, HPX will initialize MPI with MPI_THREAD_MULTIPLE.

hpx.parcel.mpi.rank

This property will be initialized to the MPI rank of the locality.

hpx.parcel.mpi.processor_name

This property will be initialized to the MPI processor name of the locality.

hpx.parcel.mpi.array_optimization

This property defines whether this locality is allowed to utilize array optimizations in the MPI parcelport during serialization of parcel data. The default is the same value as set for hpx.parcel.array_optimization.

hpx.parcel.mpi.zero_copy_optimization

This property defines whether this locality is allowed to utilize zero copy optimizations in the MPI parcelport during serialization of parcel data. The default is the same value as set for hpx.parcel.zero_copy_optimization.

hpx.parcel.mpi.zero_copy_receive_optimization

This property defines whether this locality is allowed to utilize zero copy optimizations on the receiving end in the MPI parcelport during de-serialization of parcel data. The default is the same value as set for hpx.parcel.zero_copy_receive_optimization.

hpx.parcel.mpi.zero_copy_serialization_threshold

This property defines the threshold value (in bytes) starting at which the serialization layer will apply zero-copy optimizations for serialized entities. The default is the same value as set for hpx.parcel.zero_copy_serialization_threshold.

hpx.parcel.mpi.use_io_pool

This property can be set to run the progress thread inside of HPX threads instead of a separate thread pool. The default is 1.

hpx.parcel.mpi.async_serialization

This property defines whether this locality is allowed to spawn a new thread for serialization in the MPI parcelport (this is both for encoding and decoding parcels). The default is the same value as set for hpx.parcel.async_serialization.

hpx.parcel.mpi.parcel_pool_size

The value of this property defines the number of OS threads created for the internal parcel thread pool of the MPI parcel port. The default is taken from hpx.threadpools.parcel_pool_size.

hpx.parcel.mpi.max_connections

This property defines how many network connections between different localities are overall kept alive by each locality. The default is taken from hpx.parcel.max_connections.

hpx.parcel.mpi.max_connections_per_locality

This property defines the maximum number of network connections that one locality will open to another locality. The default is taken from hpx.parcel.max_connections_per_locality.

hpx.parcel.mpi.max_message_size

This property defines the maximum allowed message size that will be transferrable through the parcel layer. The default is taken from hpx.parcel.max_message_size.

hpx.parcel.mpi.max_outbound_message_size

This property defines the maximum allowed outbound coalesced message size that will be transferrable through the parcel layer. The default is taken from hpx.parcel.max_outbound_message_size.

hpx.parcel.mpi.max_background_threads

This property defines how many cores should be used to perform background operations. The default is taken from hpx.parcel.max_background_threads.
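
As described for hpx.parcel.mpi.enable, the MPI parcelport can be switched off explicitly even when an MPI environment is detected. A minimal sketch of such an override is shown below. Whether the TCP parcelport also has to be enabled explicitly in this situation is an assumption; it is included here only to make the intent visible.

```ini
[hpx.parcel.mpi]
# Do not use the MPI parcelport, even if the application was started by MPI
enable = 0

[hpx.parcel.tcp]
# Assumed to be needed so that a parcelport remains available
enable = 1
```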

The hpx.agas configuration section#
[hpx.agas]
address = ${HPX_AGAS_SERVER_ADDRESS:<hpx_initial_ip_address>}
port = ${HPX_AGAS_SERVER_PORT:<hpx_initial_ip_port>}
service_mode = hosted
dedicated_server = 0
max_pending_refcnt_requests = ${HPX_AGAS_MAX_PENDING_REFCNT_REQUESTS:<hpx_initial_agas_max_pending_refcnt_requests>}
use_caching = ${HPX_AGAS_USE_CACHING:1}
use_range_caching = ${HPX_AGAS_USE_RANGE_CACHING:1}
local_cache_size = ${HPX_AGAS_LOCAL_CACHE_SIZE:<hpx_agas_local_cache_size>}

Property

Description

hpx.agas.address

This property defines the default IP address to be used for the AGAS root server. This IP address will be used as long as no other values are specified (for instance, using the --hpx:agas command line option). The expected format is any valid IP address or domain name format that can be resolved into an IP address. The default depends on the compile time preprocessor constant HPX_INITIAL_IP_ADDRESS ("127.0.0.1").

hpx.agas.port

This property defines the default IP port to be used for the AGAS root server. This IP port will be used as long as no other values are specified (for instance, using the --hpx:agas command line option). The default depends on the compile time preprocessor constant HPX_INITIAL_IP_PORT (7910).

hpx.agas.service_mode

This property specifies what type of AGAS service is running on this locality. Currently, two modes exist. The locality that acts as the AGAS server runs in bootstrap mode. All other localities are in hosted mode.

hpx.agas.dedicated_server

This property specifies whether the AGAS server is exclusively running AGAS services and not hosting any application components. It is a boolean value. Set to 1 if --hpx:run-agas-server-only is present.

hpx.agas.max_pending_refcnt_requests

This property defines the number of reference counting requests (increments or decrements) to buffer. The default depends on the compile time preprocessor constant HPX_INITIAL_AGAS_MAX_PENDING_REFCNT_REQUESTS (4096).

hpx.agas.use_caching

This property specifies whether a software address translation cache is used. It is a boolean value. Defaults to 1.

hpx.agas.use_range_caching

This property specifies whether range-based caching is used by the software address translation cache. This property is ignored if hpx.agas.use_caching is false. It is a boolean value. Defaults to 1.

hpx.agas.local_cache_size

This property defines the size of the software address translation cache for AGAS services. This property is ignored if hpx.agas.use_caching is false. Note that if hpx.agas.use_range_caching is true, this size will refer to the maximum number of ranges stored in the cache, not the number of entries spanned by the cache. The default depends on the compile time preprocessor constant HPX_AGAS_LOCAL_CACHE_SIZE (4096).
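
For illustration, the AGAS defaults could be overridden in an ini file as sketched below. The address and port are placeholder values invented for this example (192.0.2.10 lies in a range reserved for documentation), and the cache setting is shown only to demonstrate a boolean property.

```ini
[hpx.agas]
# Placeholder address and port of the AGAS root server
address = 192.0.2.10
port = 7920
# Keep the address translation cache, but without range-based caching
use_caching = 1
use_range_caching = 0
```

As noted in the table, values given with the --hpx:agas command line option take the place of the address and port defaults.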

The hpx.commandline configuration section#

The following table lists the definition of all pre-defined command line option shortcuts. For more information about commandline options, see the section HPX Command Line Options.

[hpx.commandline]
aliasing = ${HPX_COMMANDLINE_ALIASING:1}
allow_unknown = ${HPX_COMMANDLINE_ALLOW_UNKNOWN:0}

[hpx.commandline.aliases]
-a = --hpx:agas
-c = --hpx:console
-h = --hpx:help
-I = --hpx:ini
-l = --hpx:localities
-p = --hpx:app-config
-q = --hpx:queuing
-r = --hpx:run-agas-server
-t = --hpx:threads
-v = --hpx:version
-w = --hpx:worker
-x = --hpx:hpx
-0 = --hpx:node=0
-1 = --hpx:node=1
-2 = --hpx:node=2
-3 = --hpx:node=3
-4 = --hpx:node=4
-5 = --hpx:node=5
-6 = --hpx:node=6
-7 = --hpx:node=7
-8 = --hpx:node=8
-9 = --hpx:node=9

Note

The short options listed above are disabled by default if the application is built using #include <hpx/hpx_main.hpp>. See Re-use the main() function as the main HPX entry point for more information. The rationale behind this is that in this case the user’s application may handle its own command line options, since HPX passes all unknown options to main(). Short options like -t are prone to create ambiguities regarding what the application will support. Hence, the user should instead rely on the corresponding long options like --hpx:threads in such a case.

Property

Description

hpx.commandline.aliasing

Enable command line aliases as defined in the section hpx.commandline.aliases (see below). Defaults to 1.

hpx.commandline.allow_unknown

Allow for unknown command line options to be passed through to hpx_main(). Defaults to 0.

hpx.commandline.aliases.-a

On the commandline -a expands to: --hpx:agas.

hpx.commandline.aliases.-c

On the commandline -c expands to: --hpx:console.

hpx.commandline.aliases.-h

On the commandline -h expands to: --hpx:help.

hpx.commandline.aliases.--help

On the commandline --help expands to: --hpx:help.

hpx.commandline.aliases.-I

On the commandline -I expands to: --hpx:ini.

hpx.commandline.aliases.-l

On the commandline -l expands to: --hpx:localities.

hpx.commandline.aliases.-p

On the commandline -p expands to: --hpx:app-config.

hpx.commandline.aliases.-q

On the commandline -q expands to: --hpx:queuing.

hpx.commandline.aliases.-r

On the commandline -r expands to: --hpx:run-agas-server.

hpx.commandline.aliases.-t

On the commandline -t expands to: --hpx:threads.

hpx.commandline.aliases.-v

On the commandline -v expands to: --hpx:version.

hpx.commandline.aliases.--version

On the commandline --version expands to: --hpx:version.

hpx.commandline.aliases.-w

On the commandline -w expands to: --hpx:worker.

hpx.commandline.aliases.-x

On the commandline -x expands to: --hpx:hpx.

hpx.commandline.aliases.-0

On the commandline -0 expands to: --hpx:node=0.

hpx.commandline.aliases.-1

On the commandline -1 expands to: --hpx:node=1.

hpx.commandline.aliases.-2

On the commandline -2 expands to: --hpx:node=2.

hpx.commandline.aliases.-3

On the commandline -3 expands to: --hpx:node=3.

hpx.commandline.aliases.-4

On the commandline -4 expands to: --hpx:node=4.

hpx.commandline.aliases.-5

On the commandline -5 expands to: --hpx:node=5.

hpx.commandline.aliases.-6

On the commandline -6 expands to: --hpx:node=6.

hpx.commandline.aliases.-7

On the commandline -7 expands to: --hpx:node=7.

hpx.commandline.aliases.-8

On the commandline -8 expands to: --hpx:node=8.

hpx.commandline.aliases.-9

On the commandline -9 expands to: --hpx:node=9.
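
An application that defines its own short options could switch off the pre-defined aliases. The following is a minimal sketch, assuming the fragment is placed in one of the ini files that HPX loads at startup:

```ini
[hpx.commandline]
# Disable all short aliases such as -t or -l
aliasing = 0
# Pass options HPX does not recognize through to hpx_main()
allow_unknown = 1
```

With aliasing disabled, the long forms such as --hpx:threads remain the way to pass HPX options.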

Loading INI files#

During startup and after the internal database has been initialized as described in the section Built-in default configuration settings, HPX will try to locate and load additional ini files to be used as a source for configuration properties. This allows for a wide spectrum of additional customization possibilities by the user and system administrators. The sequence of locations where HPX will try loading the ini files is well defined and documented in this section. All ini files found are merged into the internal configuration database. The merge operation itself conforms to the rules as described in the section The HPX ini file format.

  1. Load all component shared libraries found in the directories specified by the property hpx.component_path and retrieve their default configuration information (see section Loading components for more details). This property can refer to a list of directories separated by ':' (Linux, Android, and MacOS) or by ';' (Windows).

  2. Load all files named hpx.ini in the directories referenced by the property hpx.master_ini_path. This property can refer to a list of directories separated by ':' (Linux, Android, and MacOS) or by ';' (Windows).

  3. Load a file named .hpx.ini in the current working directory, i.e., the directory the application was invoked from.

  4. Load a file referenced by the environment variable HPX_INI. This variable is expected to provide the full path name of the ini configuration file (if any).

  5. Load a file named /etc/hpx.ini. This lookup is done on non-Windows systems only.

  6. Load a file named .hpx.ini in the home directory of the current user, i.e., the directory referenced by the environment variable HOME.

  7. Load a file named .hpx.ini in the directory referenced by the environment variable PWD.

  8. Load the file specified on the command line using the option --hpx:config.

  9. Load all properties specified on the command line using the option --hpx:ini. The properties will be added to the database in the same sequence as they are specified on the command line. The format for those options is, for instance, --hpx:ini=hpx.default_stack_size=0x4000. In addition to the explicit command line options, this will set the following properties as implied from other settings:

  10. Load files based on the pattern *.ini in all directories listed by the property hpx.ini_path. All files found during this search will be merged. The property hpx.ini_path can hold a list of directories separated by ':' (on Linux or Mac) or ';' (on Windows).

  11. Load the file specified on the command line using the option --hpx:app-config. Note that this file will be merged as the content for a top level section [application].
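Several of the steps above read a property holding a list of directories. A minimal sketch of how such a list could be split into its entries is shown here. The names path_list_separator and split_path_list are hypothetical and are not part of the HPX API; only the separator characters are taken from the description above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Platform-specific separator for directory lists: ';' on Windows,
// ':' elsewhere (as described for hpx.component_path and hpx.ini_path).
#if defined(_WIN32)
constexpr char path_list_separator = ';';
#else
constexpr char path_list_separator = ':';
#endif

// Split a directory list such as "/etc/hpx:/opt/hpx/ini" into its entries.
// Empty entries (for instance from a trailing separator) are dropped.
std::vector<std::string> split_path_list(
    std::string const& list, char separator = path_list_separator)
{
    std::vector<std::string> directories;
    std::string::size_type begin = 0;
    while (begin <= list.size())
    {
        std::string::size_type end = list.find(separator, begin);
        if (end == std::string::npos)
            end = list.size();
        if (end > begin)
            directories.push_back(list.substr(begin, end - begin));
        begin = end + 1;
    }
    return directories;
}
```

Each returned directory would then be searched in turn, for example for files matching *.ini in the case of hpx.ini_path.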

Note

Any changes made to the configuration database caused by one of the steps will influence the loading process for all subsequent steps. For instance, if one of the ini files loaded changes the property hpx.ini_path, this will influence the directories searched in step 10 as described above.

Important

The HPX core library checks all configuration settings specified on the command line (using the --hpx:ini option) for validity. That is, the library accepts only known configuration settings, which protects the user from unintentional typos while specifying those settings. This behavior can be overridden by appending a '!' to the configuration key, which forces the setting to be entered into the configuration database. For instance: --hpx:ini=hpx.foo! = 1
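The validity check can be pictured with the following sketch. It is not the HPX implementation: the type ini_option and the functions trim, parse_ini_option and accept_option are hypothetical names, and the set of known keys is passed in by the caller purely for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>

// One parsed definition as passed to --hpx:ini (hypothetical helper type).
struct ini_option
{
    std::string key;
    std::string value;
    bool forced;    // true if the key carried a trailing '!'
};

// Remove leading and trailing blanks and tabs.
std::string trim(std::string const& s)
{
    std::string::size_type const first = s.find_first_not_of(" \t");
    if (first == std::string::npos)
        return "";
    std::string::size_type const last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Parse "key=value" or "key! = value" into its parts.
ini_option parse_ini_option(std::string const& definition)
{
    std::string::size_type const eq = definition.find('=');
    std::string key = trim(definition.substr(0, eq));
    std::string const value =
        eq == std::string::npos ? "" : trim(definition.substr(eq + 1));

    bool const forced = !key.empty() && key.back() == '!';
    if (forced)
    {
        key.pop_back();    // drop the '!' marker
        key = trim(key);
    }
    return {key, value, forced};
}

// A setting is accepted if its key is known or if it was forced with '!'.
bool accept_option(
    ini_option const& option, std::set<std::string> const& known_keys)
{
    return option.forced || known_keys.count(option.key) != 0;
}
```

With this model, hpx.foo! = 1 is accepted even though hpx.foo is unknown, while hpx.foo = 1 would be rejected.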

If any of the environment variables or files listed above are not found, the corresponding loading step will be silently skipped.

Loading components#

HPX relies on loading application specific components during the runtime of an application. Moreover, HPX comes with a set of preinstalled components supporting basic functionalities useful for almost every application. Any component in HPX is loaded from a shared library, where any of the shared libraries can contain more than one component type. During startup, HPX tries to locate all available components (i.e., their corresponding shared libraries) and creates an internal component registry for later use. This section describes the algorithm used by HPX to locate all relevant shared libraries on a system. As described, this algorithm is customizable by the configuration properties loaded from the ini files (see section Loading INI files).

Loading components is a two-stage process. First HPX tries to locate all component shared libraries, loads those, and generates a default configuration section in the internal configuration database for each component found. For each found component the following information is generated:

[hpx.components.<component_instance_name>]
name = <name_of_shared_library>
path = $[component_path]
enabled = $[hpx.components.load_external]
default = 1

The values in this section correspond to the expected configuration information for a component as described in the section Built-in default configuration settings.

In order to locate component shared libraries, HPX will try loading all shared libraries (files with the platform specific extension of a shared library, Linux: *.so, Windows: *.dll, MacOS: *.dylib) found in the directory referenced by the ini property hpx.component_path.

This first step corresponds to step 1) during the process of filling the internal configuration database with default information as described in section Loading INI files.

After all of the configuration information has been loaded, HPX performs the second step in terms of loading components. During this step, HPX scans all existing configuration sections [hpx.components.<some_component_instance_name>] and instantiates a special factory object for each of the successfully located and loaded components. During the application’s lifetime, these factory objects are responsible for creating new and discarding old instances of the component they are associated with. This step is performed after step 11) of the process of filling the internal configuration database with default information as described in section Loading INI files.

Application specific component example#

This section assumes there is a simple application component that exposes one member function as a component action. The header file app_server.hpp declares the C++ type to be exposed as a component. This type has a member function print_greeting(), which is exposed as an action print_greeting_action. We assume the source files for this example are located in a directory referenced by $APP_ROOT:

// file: $APP_ROOT/app_server.hpp
#include <hpx/hpx.hpp>
#include <hpx/include/iostreams.hpp>

namespace app
{
    // Define a simple component exposing one action 'print_greeting'
    class HPX_COMPONENT_EXPORT server
      : public hpx::components::component_base<server>
    {
    public:
        void print_greeting()
        {
            hpx::cout << "Hey, how are you?\n" << std::flush;
        }

        // Component actions need to be declared, this also defines the
        // type 'print_greeting_action' representing the action.
        HPX_DEFINE_COMPONENT_ACTION(server, print_greeting, print_greeting_action);
    };
}

// Declare boilerplate code required for each of the component actions.
HPX_REGISTER_ACTION_DECLARATION(app::server::print_greeting_action);

The corresponding source file contains mainly macro invocations that define the boilerplate code needed for HPX to function properly:

// file: $APP_ROOT/app_server.cpp
#include "app_server.hpp"

// Define boilerplate required once per component module.
HPX_REGISTER_COMPONENT_MODULE();

// Define factory object associated with our component of type 'app::server'.
HPX_REGISTER_COMPONENT(app::server, app_server);

// Define boilerplate code required for each of the component actions. Use the
// same argument as used for HPX_REGISTER_ACTION_DECLARATION above.
HPX_REGISTER_ACTION(app::server::print_greeting_action);

The following gives an example of how the component can be used. Here, one instance of the app::server component is created on the current locality and the exposed action print_greeting_action is invoked using the global id of the newly created instance. Note that no special code is required to delete the component instance after it is not needed anymore. It will be deleted automatically when its last reference goes out of scope (shown in the example below at the closing brace of the block surrounding the code):

// file: $APP_ROOT/use_app_server_example.cpp
#include <hpx/hpx_init.hpp>
#include "app_server.hpp"

int hpx_main()
{
    {
        // Create an instance of the app_server component on the current locality.
        hpx::naming::id_type app_server_instance =
            hpx::create_component<app::server>(hpx::find_here());

        // Create an instance of the action 'print_greeting_action'.
        app::server::print_greeting_action print_greeting;

        // Invoke the action 'print_greeting' on the newly created component.
        print_greeting(app_server_instance);
    }
    return hpx::finalize();
}

int main(int argc, char* argv[])
{
    return hpx::init(argc, argv);
}

In order to make sure that the application will be able to use the component app::server, special configuration information must be passed to HPX. The simplest way to allow HPX to ‘find’ the component is to provide special ini configuration files that add the necessary information to the internal configuration database. The component should have a special ini file containing the information specific to the component app_server.

# file: $APP_ROOT/app_server.ini
[hpx.components.app_server]
name = app_server
path = $APP_LOCATION/

Here, $APP_LOCATION is the directory where the (binary) component shared library is located. HPX will attempt to load the shared library from there. The section name hpx.components.app_server reflects the instance name of the component (app_server is an arbitrary, but unique name). The property value for hpx.components.app_server.name should be the same as used for the second argument to the macro HPX_REGISTER_COMPONENT above.

Additionally, a file .hpx.ini, which could be located in the current working directory (see step 3 as described in the section Loading INI files), can be used to add to the ini search path for components:

# file: $PWD/.hpx.ini
[hpx]
ini_path = $[hpx.ini_path]:$APP_ROOT/

This assumes that the above ini file specific to the component is located in the directory $APP_ROOT.

Note

It is possible to reference the defined property from inside its value. HPX will gracefully use the previous value of hpx.ini_path for the reference on the right hand side and assign the overall (now expanded) value to the property.
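The self-reference described in this note can be modeled as follows. This is a simplified sketch and not the HPX implementation: config_database, expand_references and set_property are hypothetical names, only the $[key] form is handled, and environment references such as $APP_ROOT are not expanded.

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified model of the internal configuration database.
using config_database = std::map<std::string, std::string>;

// Replace every "$[key]" in 'value' with the current database entry for
// 'key'. Unknown keys expand to nothing; an unterminated "$[" is kept as is.
std::string expand_references(
    std::string const& value, config_database const& db)
{
    std::string result;
    std::string::size_type pos = 0;
    while (pos < value.size())
    {
        std::string::size_type const open = value.find("$[", pos);
        std::string::size_type const close = open == std::string::npos ?
            std::string::npos : value.find(']', open + 2);
        if (close == std::string::npos)
        {
            result.append(value, pos, std::string::npos);
            break;
        }
        result.append(value, pos, open - pos);
        config_database::const_iterator const it =
            db.find(value.substr(open + 2, close - open - 2));
        if (it != db.end())
            result += it->second;
        pos = close + 1;
    }
    return result;
}

// Expand the new value against the *previous* state of the database first,
// then store the expanded value under 'key'.
void set_property(
    config_database& db, std::string const& key, std::string const& value)
{
    std::string const expanded = expand_references(value, db);
    db[key] = expanded;
}
```

Because the expansion happens before the assignment, a value such as $[hpx.ini_path]:/some/dir extends the existing list instead of replacing it.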

Logging#

HPX uses a sophisticated logging framework, allowing users to follow in detail what operations have been performed inside the HPX library in what sequence. This information proves to be very useful for diagnosing problems or just for improving the understanding of what is happening in HPX as a consequence of invoking HPX API functionality.

Default logging#

Enabling default logging is a simple process. The detailed description in the remainder of this section explains different ways to customize the defaults. Default logging can be enabled by using one of the following:

  • A command line switch --hpx:debug-hpx-log, which will enable logging to the console terminal.

  • The command line switch --hpx:debug-hpx-log=<filename>, which enables logging to a given file <filename>.

  • Setting an environment variable HPX_LOGLEVEL=<loglevel> while running the HPX application. In this case <loglevel> should be a number between (or equal to) 1 and 5, where 1 means minimal logging and 5 causes all available messages to be logged. When setting the environment variable, the logs will be written to a file named hpx.<PID>.log in the current working directory, where <PID> is the process id of the console instance of the application.

Customizing logging#

Generally, logging can be customized either by using environment variable settings or by using an ini configuration file. Logging is generated in several categories, each of which can be customized independently. All customizable configuration parameters have reasonable defaults, allowing for the use of logging without any additional configuration effort. The following table lists the available categories.

Table 5 Logging categories#

| Category | Category shortcut | Information to be generated | Environment variable |
|---|---|---|---|
| General | None | Logging information generated by different subsystems of HPX, such as thread-manager, parcel layer, LCOs, etc. | HPX_LOGLEVEL |
| AGAS | AGAS | Logging output generated by the AGAS subsystem. | HPX_AGAS_LOGLEVEL |
| Application | APP | Logging generated by applications. | HPX_APP_LOGLEVEL |

By default, all logging output is redirected to the console instance of an application, where it is collected and written to a file, one file for each logging category.

Each logging category can be customized at two levels. The parameters for each are stored in the ini configuration sections hpx.logging.CATEGORY and hpx.logging.console.CATEGORY (where CATEGORY is the category shortcut as listed in the table above). The former influences logging at the source locality and the latter modifies the logging behaviour for each of the categories at the console instance of an application.

Levels#

All HPX logging output is assigned one of six logging levels. These levels can be set explicitly or through environment variables in the main HPX ini file as shown below. The logging levels and their associated integral values are shown in the table below, ordered from most verbose to least verbose. By default, all HPX logs are set to 0, i.e., all logging output is disabled by default.

Table 6 Logging levels#

| Logging level | Integral value |
|---|---|
| <debug> | 5 |
| <info> | 4 |
| <warning> | 3 |
| <error> | 2 |
| <fatal> | 1 |
| No logging | 0 |

Tip

The easiest way to enable logging output is to set the environment variable corresponding to the logging category to an integral value as described in the table above. For instance, setting HPX_LOGLEVEL=5 will enable full logging output for the general category. Please note that the syntax and means of setting environment variables varies between operating systems.
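The relationship between the configured level and the messages that are emitted can be sketched as shown here. The enumeration mirrors the table above; log_level, level_from_string and is_enabled are hypothetical names, and both the clamping of out-of-range values and the comparison rule (a message is emitted if its level does not exceed the configured level) are assumptions made for this illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Logging levels and their integral values as listed in the table above.
enum class log_level : int
{
    none = 0,
    fatal = 1,
    error = 2,
    warning = 3,
    info = 4,
    debug = 5
};

// Convert the text of a variable such as HPX_LOGLEVEL to a level.
// Assumption: values outside 0..5 are clamped, unparsable text disables
// logging.
log_level level_from_string(std::string const& text)
{
    try
    {
        int value = std::stoi(text);
        if (value < 0)
            value = 0;
        if (value > 5)
            value = 5;
        return static_cast<log_level>(value);
    }
    catch (std::exception const&)
    {
        return log_level::none;
    }
}

// A message is emitted if its level is not more verbose than the
// configured level.
bool is_enabled(log_level configured, log_level message)
{
    return message != log_level::none &&
        static_cast<int>(message) <= static_cast<int>(configured);
}
```

Under these assumptions, a configured level of 3 lets <fatal>, <error> and <warning> messages through and suppresses <info> and <debug>.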

Configuration#

Logs will be saved to destinations as configured by the user. By default, logging output is saved on the console instance of an application to hpx.<CATEGORY>.<PID>.log (where <CATEGORY> and <PID> are placeholders for the category shortcut and the OS process id). The output for the general logging category is saved to hpx.<PID>.log. The default settings for the general logging category are shown here (the syntax is described in the section The HPX ini file format):

[hpx.logging]
level = ${HPX_LOGLEVEL:0}
destination = ${HPX_LOGDESTINATION:console}
format = ${HPX_LOGFORMAT:(T%locality%/%hpxthread%.%hpxphase%/%hpxcomponent%) P%parentloc%/%hpxparent%.%hpxparentphase% %time%($hh:$mm.$ss.$mili) [%idx%]|\\n}

The logging level is taken from the environment variable HPX_LOGLEVEL and defaults to zero, i.e., no logging. The default logging destination is read from the environment variable HPX_LOGDESTINATION. On any of the localities it defaults to console, which redirects all generated logging output to the console instance of an application. The following table lists the possible destinations for any logging output. It is possible to specify more than one destination separated by whitespace.

Table 7 Logging destinations#

| Logging destination | Description |
|---|---|
| file(<filename>) | Directs all output to a file with the given <filename>. |
| cout | Directs all output to the local standard output of the application instance on this locality. |
| cerr | Directs all output to the local standard error output of the application instance on this locality. |
| console | Directs all output to the console instance of the application. The console instance has its logging destinations configured separately. |
| android_log | Directs all output to the (Android) system log (available on Android systems only). |

The logging format is read from the environment variable HPX_LOGFORMAT, and it defaults to a complex format description. This format consists of several placeholder fields (for instance %locality%), which will be replaced by concrete values when the logging output is generated. All other information is transferred verbatim to the output. The table below describes the available field placeholders. The separator character | separates the logging message prefix formatted as shown and the actual log message which will replace the separator.

Table 8 Available field placeholders#

| Name | Description |
|---|---|
| locality | The id of the locality on which the logging message was generated. |
| hpxthread | The id of the HPX thread generating this logging output. |
| hpxphase | The phase of the HPX thread generating this logging output. |
| hpxcomponent | The local virtual address of the component which the current HPX thread is accessing. |
| parentloc | The id of the locality where the HPX thread was running that initiated the current HPX thread. The current HPX thread is generating this logging output. |
| hpxparent | The id of the HPX thread that initiated the current HPX thread. The current HPX thread is generating this logging output. |
| hpxparentphase | The phase of the HPX thread when it initiated the current HPX thread. The current HPX thread is generating this logging output. |
| time | The time stamp for this logging output line as generated by the source locality. |
| idx | The sequence number of the logging output line as generated on the source locality. |
| osthread | The sequence number of the OS thread that executes the current HPX thread. |

Note

Not all of the field placeholders may be expanded for all generated logging output. If no value is available for a particular field, it is replaced with a sequence of '-' characters.
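The placeholder substitution can be illustrated with the following sketch. It is a simplified model, not the formatter used by HPX: field_values and format_log_line are hypothetical names, the width of eight '-' characters for missing fields is an assumption, and the additional time formatting that follows %time% in the default format is not modeled.

```cpp
#include <cassert>
#include <map>
#include <string>

// Values available for the field placeholders of one log message.
using field_values = std::map<std::string, std::string>;

// Expand "%name%" placeholders and replace the separator '|' with the
// actual log message. All other characters are copied verbatim.
std::string format_log_line(std::string const& format,
    field_values const& fields, std::string const& message)
{
    std::string line;
    std::string::size_type pos = 0;
    while (pos < format.size())
    {
        char const c = format[pos];
        std::string::size_type const close =
            c == '%' ? format.find('%', pos + 1) : std::string::npos;
        if (c == '|')
        {
            line += message;
            ++pos;
        }
        else if (close != std::string::npos)
        {
            field_values::const_iterator const it =
                fields.find(format.substr(pos + 1, close - pos - 1));
            // Assumption: missing values are shown as eight '-' characters.
            line += it != fields.end() ? it->second : std::string(8, '-');
            pos = close + 1;
        }
        else
        {
            line += c;    // literal character (or a stray '%')
            ++pos;
        }
    }
    return line;
}
```

A field without a value, such as parentloc for a thread that has no parent, shows up as a run of '-' characters in the resulting prefix.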

Here is an example line from a logging output generated by one of the HPX examples (please note that this is generated on a single line, without a line break):

(T00000000/0000000002d46f90.01/00000000009ebc10) P--------/0000000002d46f80.02 17:49.37.320 [000000000000004d]
    <info>  [RT] successfully created component {0000000100ff0001, 0000000000030002} of type: component_barrier[7(3)]

The default settings for the general logging category on the console are shown here:

[hpx.logging.console]
level = ${HPX_LOGLEVEL:$[hpx.logging.level]}
destination = ${HPX_CONSOLE_LOGDESTINATION:file(hpx.$[system.pid].log)}
format = ${HPX_CONSOLE_LOGFORMAT:|}

These settings define how the logging is customized once the logging output is received by the console instance of an application. The logging level is read from the environment variable HPX_LOGLEVEL (as set for the console instance of the application). The level defaults to the same value as the corresponding setting in the general logging configuration shown before. The destination on the console instance is set to be a file whose name is generated based on its OS process id. Setting the environment variable HPX_CONSOLE_LOGDESTINATION allows customization of the naming scheme for the output file. The logging format is set to leave the original logging output unchanged, as received from one of the localities the application runs on.
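The ${NAME:default} entries used in the logging settings of this section take their value from an environment variable and fall back to a default. A minimal sketch of that lookup is given here; expand_environment is a hypothetical name, and a nested $[...] reference inside the default is returned unexpanded by this simplified version.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Expand one "${NAME:default}" entry: the value of the environment variable
// NAME if it is set, otherwise the default. Text that does not have this
// form is returned unchanged.
std::string expand_environment(std::string const& entry)
{
    if (entry.size() < 4 || entry.compare(0, 2, "${") != 0 ||
        entry.back() != '}')
    {
        return entry;
    }

    // Strip the leading "${" and the trailing "}".
    std::string const body = entry.substr(2, entry.size() - 3);
    std::string::size_type const colon = body.find(':');
    std::string const name = body.substr(0, colon);
    std::string const fallback =
        colon == std::string::npos ? "" : body.substr(colon + 1);

    char const* value = std::getenv(name.c_str());
    return value != nullptr ? std::string(value) : fallback;
}
```

For example, ${HPX_LOGLEVEL:0} yields 0 unless HPX_LOGLEVEL is set in the environment of the process.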

HPX Command Line Options#

The predefined command line options for any application using hpx::init are described in the following subsections.

HPX options (allowed on command line only)#
--hpx:help#

Print out program usage (default: this message). Possible values: full (additionally prints options from components).

--hpx:version#

Print out HPX version and copyright information.

--hpx:info#

Print out HPX configuration information.

--hpx:options-file arg#

Specify a file containing command line options (alternatively: @filepath).

HPX options (additionally allowed in an options file)#
--hpx:worker#

Run this instance in worker mode.

--hpx:console#

Run this instance in console mode.

--hpx:connect#

Run this instance in worker mode, but connecting late.

--hpx:run-agas-server#

Run AGAS server as part of this runtime instance.

--hpx:run-hpx-main#

Run the hpx_main function, regardless of locality mode.

--hpx:hpx arg#

The IP address the HPX parcelport is listening on, expected format: address:port (default: 127.0.0.1:7910).

--hpx:agas arg#

The IP address the AGAS root server is running on, expected format: address:port (default: 127.0.0.1:7910).

--hpx:run-agas-server-only#

Run only the AGAS server.

--hpx:nodefile arg#

The file name of a node file to use (list of nodes, one node name per line and core).

--hpx:nodes arg#

The (space separated) list of the nodes to use (usually this is extracted from a node file).

--hpx:endnodes#

This can be used to end the list of nodes specified using the option --hpx:nodes.

--hpx:ifsuffix arg#

Suffix to append to host names in order to resolve them to the proper network interconnect.

--hpx:ifprefix arg#

Prefix to prepend to host names in order to resolve them to the proper network interconnect.

--hpx:iftransform arg#

Sed-style search and replace (s/search/replace/) used to transform host names to the proper network interconnect.

--hpx:force_ipv4#

Network hostnames will be resolved to ipv4 addresses instead of using the first resolved endpoint. This is especially useful on Windows where the local hostname will resolve to an ipv6 address while remote network hostnames are commonly resolved to ipv4 addresses.

--hpx:localities arg#

The number of localities to wait for at application startup (default: 1).

--hpx:node arg#

Number of the node this locality is run on (must be unique).

--hpx:ignore-batch-env#

Ignore batch environment variables.

--hpx:expect-connecting-localities#

This locality expects other localities to dynamically connect (this is implied if the number of initial localities is larger than 1).

--hpx:pu-offset#

The first processing unit this instance of HPX should be run on (default: 0).

--hpx:pu-step#

The step between used processing unit numbers for this instance of HPX (default: 1).

--hpx:threads arg#

The number of operating system threads to spawn for this HPX locality. Possible values are: numeric values 1, 2, 3 and so on, all (which spawns one thread per processing unit, includes hyperthreads), or cores (which spawns one thread per core) (default: cores).

--hpx:cores arg#

The number of cores to utilize for this HPX locality (default: all, i.e., the number of cores is based on the number of threads --hpx:threads assuming --hpx:bind=compact).

--hpx:affinity arg#

The affinity domain the OS threads will be confined to, possible values: pu, core, numa, machine (default: pu).

--hpx:bind arg#

The detailed affinity description for the OS threads; see More details about HPX command line options for a detailed description of possible values. Do not use with the --hpx:pu-step, --hpx:pu-offset or --hpx:affinity options. Implies --hpx:numa-sensitive (--hpx:bind=none disables defining thread affinities).

--hpx:use-process-mask#

Use the process mask to restrict available hardware resources (implies --hpx:ignore-batch-env).

--hpx:print-bind#

Print to the console the bit masks calculated from the arguments specified to all --hpx:bind options.

--hpx:queuing arg#

The queue scheduling policy to use. Options are local, local-priority-fifo, local-priority-lifo, static, static-priority, abp-priority-fifo, local-workrequesting-fifo, local-workrequesting-lifo, local-workrequesting-mc, and abp-priority-lifo (default: local-priority-fifo).

--hpx:high-priority-threads arg#

The number of operating system threads maintaining a high priority queue (default: number of OS threads), valid for --hpx:queuing=abp-priority, --hpx:queuing=static-priority and --hpx:queuing=local-priority only.

--hpx:numa-sensitive#

Makes the scheduler NUMA sensitive.
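The host name options in the list above (--hpx:ifprefix, --hpx:ifsuffix and --hpx:iftransform) all rewrite a host name before it is resolved. The following sketch illustrates the idea with the C++ standard library only. The function names are hypothetical, the use of std::regex (ECMAScript syntax, with $1-style references in the replacement) is an assumption, and nothing here is meant to describe how HPX implements these options internally.

```cpp
#include <cassert>
#include <regex>
#include <stdexcept>
#include <string>

// Prepend a prefix and append a suffix to a host name.
std::string add_interface_affixes(std::string const& host,
    std::string const& prefix, std::string const& suffix)
{
    return prefix + host + suffix;
}

// Apply a sed-style "s/search/replace/" expression to a host name.
// Only the first match is replaced, as sed does without the 'g' flag.
std::string apply_iftransform(
    std::string const& host, std::string const& expression)
{
    std::string::size_type const mid = expression.find('/', 2);
    std::string::size_type const end = mid == std::string::npos ?
        std::string::npos : expression.find('/', mid + 1);
    if (expression.compare(0, 2, "s/") != 0 || end == std::string::npos)
        throw std::invalid_argument("expected s/search/replace/");

    std::regex const search(expression.substr(2, mid - 2));
    std::string const replacement =
        expression.substr(mid + 1, end - mid - 1);
    return std::regex_replace(host, search, replacement,
        std::regex_constants::format_first_only);
}
```

With a hypothetical cluster whose fast interconnect uses names prefixed by ib-, the expression s/node/ib-node/ would turn node01 into ib-node01.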

HPX configuration options#
--hpx:app-config arg#

Load the specified application configuration (ini) file.

--hpx:config arg#

Load the specified HPX configuration (ini) file.

--hpx:ini arg#

Add a configuration definition to the default runtime configuration.

--hpx:exit#

Exit after configuring the runtime.

HPX debugging options#
--hpx:list-symbolic-names#

List all registered symbolic names after startup.

--hpx:list-component-types#

List all dynamic component types after startup.

--hpx:dump-config-initial#

Print the initial runtime configuration.

--hpx:dump-config#

Print the final runtime configuration.

--hpx:debug-hpx-log [arg]#

Enable all messages on the HPX log channel and send all HPX logs to the target destination (default: cout).

--hpx:debug-agas-log [arg]#

Enable all messages on the AGAS log channel and send all AGAS logs to the target destination (default: cout).

--hpx:debug-parcel-log [arg]#

Enable all messages on the parcel transport log channel and send all parcel transport logs to the target destination (default: cout).

--hpx:debug-timing-log [arg]#

Enable all messages on the timing log channel and send all timing logs to the target destination (default: cout).

--hpx:debug-app-log [arg]#

Enable all messages on the application log channel and send all application logs to the target destination (default: cout).

--hpx:debug-clp#

Debug command line processing.

--hpx:attach-debugger arg#

Wait for a debugger to be attached, possible arg values: startup or exception (default: startup)

Command line argument shortcuts#

Additionally, the following shortcuts are available from every HPX application.

Table 9 Predefined command line option shortcuts#

| Shortcut option | Equivalent long option |
|---|---|
| -a | --hpx:agas |
| -c | --hpx:console |
| -h | --hpx:help |
| -I | --hpx:ini |
| -l | --hpx:localities |
| -p | --hpx:app-config |
| -q | --hpx:queuing |
| -r | --hpx:run-agas-server |
| -t | --hpx:threads |
| -v | --hpx:version |
| -w | --hpx:worker |
| -x | --hpx:hpx |
| -0 | --hpx:node=0 |
| -1 | --hpx:node=1 |
| -2 | --hpx:node=2 |
| -3 | --hpx:node=3 |
| -4 | --hpx:node=4 |
| -5 | --hpx:node=5 |
| -6 | --hpx:node=6 |
| -7 | --hpx:node=7 |
| -8 | --hpx:node=8 |
| -9 | --hpx:node=9 |

Note

The short options listed above are disabled by default if the application is built using #include <hpx/hpx_main.hpp>. See Re-use the main() function as the main HPX entry point for more information. The rationale behind this is that in this case the user’s application may handle its own command line options, since HPX passes all unknown options to main(). Short options like -t are prone to create ambiguities regarding what the application will support. Hence, the user should instead rely on the corresponding long options like --hpx:threads in such a case.

It is possible to define your own shortcut options. In fact, all of the shortcuts listed above are pre-defined using the technique described here. Also, it is possible to redefine any of the pre-defined shortcuts to expand differently as well.

Shortcut options are obtained from the internal configuration database. They are stored as key-value properties in a special properties section named hpx.commandline.aliases. You can define your own shortcuts by adding the corresponding definitions to one of the ini configuration files as described in the section Configuring HPX applications. For instance, in order to define a command line shortcut --pc, which should expand to --hpx:print-counter, the following configuration information needs to be added to one of the ini configuration files:

[hpx.commandline.aliases]
--pc = --hpx:print-counter

Note

Any arguments for shortcut options passed on the command line are retained and passed as arguments to the corresponding expanded option. For instance, given the definition above, the command line option:

--pc=/threads{locality#0/total}/count/cumulative

would be expanded to:

--hpx:print-counter=/threads{locality#0/total}/count/cumulative

Important

Any shortcut option should either start with a single '-' or with two '--' characters. Shortcuts starting with a single '-' are interpreted as short options (i.e., everything after the first character following the '-' is treated as the argument). Shortcuts starting with '--' are interpreted as long options. No other shortcut formats are supported.
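The expansion rules for shortcuts can be sketched as follows. This is an illustration, not the HPX command line handling: alias_map and expand_shortcut are hypothetical names, and joining the argument of a short option to the expanded option with '=' is an assumption of this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>

// Alias definitions as they could be read from [hpx.commandline.aliases].
using alias_map = std::map<std::string, std::string>;

// Expand one command line argument if it matches a defined shortcut;
// otherwise return it unchanged.
std::string expand_shortcut(
    std::string const& argument, alias_map const& aliases)
{
    if (argument.size() < 2 || argument[0] != '-')
        return argument;

    if (argument[1] == '-')
    {
        // Long shortcut: "--name" or "--name=value"; the value is retained.
        std::string::size_type const eq = argument.find('=');
        alias_map::const_iterator const it =
            aliases.find(argument.substr(0, eq));
        if (it == aliases.end())
            return argument;
        return eq == std::string::npos ? it->second :
                                         it->second + argument.substr(eq);
    }

    // Short shortcut: everything after the first character following the
    // '-' is treated as the argument, e.g. "-t4".
    alias_map::const_iterator const it = aliases.find(argument.substr(0, 2));
    if (it == aliases.end())
        return argument;
    std::string const rest = argument.substr(2);
    return rest.empty() ? it->second : it->second + "=" + rest;
}
```

Arguments that do not match any alias, including the regular --hpx: options, pass through unchanged.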

Specifying options for single localities only#

For runs involving more than one locality, it is sometimes desirable to supply specific command line options to single localities only. When the HPX application is launched using a scheduler (like PBS; for more details see section How to use HPX applications with PBS), specifying dedicated command line options for single localities may be desirable. For this reason all of the command line options that have the general format --hpx:<some_key> can be used in a more general form: --hpx:<N>:<some_key>, where <N> is the number of the locality this command line option will be applied to; all other localities will simply ignore the option. For instance, the following PBS script passes the option --hpx:pu-offset=4 to the locality '1' only.

#!/bin/bash
#
#PBS -l nodes=2:ppn=4

APP_PATH=~/packages/hpx/bin/hello_world_distributed
APP_OPTIONS=

pbsdsh -u $APP_PATH $APP_OPTIONS --hpx:1:pu-offset=4 --hpx:nodes=`cat $PBS_NODEFILE`
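How a single locality could decide whether an option of the form --hpx:<N>:<some_key> applies to it is sketched here. The function resolve_locality_option is a hypothetical helper written for this illustration and does not reflect the actual HPX implementation.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Decide whether 'option' applies to the locality 'this_locality'.
// Returns false if the option targets a different locality. Otherwise
// returns true and stores the option in its general --hpx:<some_key> form
// in 'result'.
bool resolve_locality_option(
    std::string const& option, unsigned this_locality, std::string& result)
{
    std::string const prefix = "--hpx:";
    if (option.compare(0, prefix.size(), prefix) != 0)
    {
        result = option;    // not an HPX option, passed through unchanged
        return true;
    }

    // Look for "<N>:" directly after the prefix.
    std::string::size_type pos = prefix.size();
    while (pos < option.size() &&
        std::isdigit(static_cast<unsigned char>(option[pos])))
    {
        ++pos;
    }
    if (pos == prefix.size() || pos >= option.size() || option[pos] != ':')
    {
        result = option;    // general form, applies to every locality
        return true;
    }

    unsigned long const target =
        std::stoul(option.substr(prefix.size(), pos - prefix.size()));
    if (target != this_locality)
        return false;

    result = prefix + option.substr(pos + 1);
    return true;
}
```

For the script above, locality 1 would see --hpx:pu-offset=4, while locality 0 would drop the option.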

Caution

If the first application specific argument (inside $APP_OPTIONS) is a non-option (i.e., does not start with a - or a --), then it must be placed before the option --hpx:nodes, which, in this case, should be the last option on the command line.

Alternatively, use the option --hpx:endnodes to explicitly mark the end of the list of node names:

$ pbsdsh -u $APP_PATH --hpx:1:pu-offset=4 --hpx:nodes=`cat $PBS_NODEFILE` --hpx:endnodes $APP_OPTIONS

More details about HPX command line options#

This section documents the following list of the command line options in more detail:

The command line option --hpx:bind#

This command line option allows one to specify the required affinity of the HPX worker threads to the underlying processing units. As a result the worker threads will run only on the processing units identified by the corresponding bind specification. The affinity settings are to be specified using --hpx:bind=<BINDINGS>, where <BINDINGS> have to be formatted as described below.

In addition to the syntax described below, one can use --hpx:bind=none to disable all binding of any threads to a particular core. This is mostly supported for debugging purposes.

The specified affinities refer to specific regions within a machine hardware topology. In order to understand the hardware topology of a particular machine, it may be useful to run the lstopo tool, which is part of Portable Hardware Locality (HWLOC), to see the reported topology tree. Seeing and understanding a topology tree will definitely help in understanding the concepts that are discussed below.

Affinities can be specified using hwloc tuples. Tuples of hwloc objects and associated indexes can be specified in the form object:index, object:index-index or object:index,...,index. Hwloc objects represent types of mapped items in a topology tree. Possible values for objects are socket, numanode, core and pu (processing unit). Indexes are non-negative integers that specify a unique physical object in a topology tree using its logical sequence number.

Chaining multiple tuples together in the more general form object1:index1[.object2:index2[...]] is permissible. While the first tuple’s object may appear anywhere in the topology, the Nth tuple’s object must have a shallower topology depth than the (N+1)th tuple’s object. Put simply: as you move right in a tuple chain, objects must go deeper in the topology tree. Indexes specified in chained tuples are relative to the scope of the parent object. For example, socket:0.core:1 refers to the second core in the first socket (all indices are zero based).

Multiple affinities can be specified using several --hpx:bind command line options or by appending several affinities separated by a ';'. By default, if multiple affinities are specified, they are added.

"all" is a special affinity consisting of the entire current topology.

Note

All “names” in an affinity specification, such as thread, socket, numanode, pu or all, can be abbreviated. Thus, the affinity specification thread:0-3=socket:0.core:1.pu:1 is fully equivalent to its shortened form t:0-3=s:0.c:1.p:1.

Here is a full grammar describing the possible format of mappings:

mappings     ::=  distribution | mapping (";" mapping)*
distribution ::=  "compact" | "scatter" | "balanced" | "numa-balanced"
mapping      ::=  thread_spec "=" pu_specs
thread_spec  ::=  "thread:" range_specs
pu_specs     ::=  pu_spec ("." pu_spec)*
pu_spec      ::=  type ":" range_specs | "~" pu_spec
range_specs  ::=  range_spec ("," range_spec)*
range_spec   ::=  int | int "-" int | "all"
type         ::=  "socket" | "numanode" | "core" | "pu"
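To make the range_specs production concrete, the following is a minimal sketch in standard C++ of how such a string could be expanded into a list of indexes. It is not HPX's parser: the function name expand_range_specs and its count parameter (the number of objects that "all" stands for) are invented for this illustration, and the number parsing is deliberately lenient.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Expand a range_specs string such as "0-3" or "0,2,4-5" into the indexes
// it denotes. "all" expands to 0..count-1, where `count` is the number of
// objects available at that level (a parameter invented for this sketch).
std::vector<std::size_t> expand_range_specs(
    std::string const& specs, std::size_t count)
{
    std::vector<std::size_t> result;
    std::size_t pos = 0;
    while (pos <= specs.size())
    {
        assert(pos <= specs.size());

        // Split on ',' to obtain one range_spec at a time.
        std::size_t end = specs.find(',', pos);
        if (end == std::string::npos)
            end = specs.size();
        std::string const item = specs.substr(pos, end - pos);
        if (item.empty())
            throw std::invalid_argument("empty range_spec");

        if (item == "all")
        {
            for (std::size_t i = 0; i != count; ++i)
                result.push_back(i);
        }
        else
        {
            // Either a single int or "first-last" (inclusive).
            std::size_t const dash = item.find('-');
            std::size_t const first = std::stoul(item.substr(0, dash));
            std::size_t const last = (dash == std::string::npos) ?
                first :
                std::stoul(item.substr(dash + 1));
            if (last < first)
                throw std::invalid_argument("descending range_spec");
            for (std::size_t i = first; i <= last; ++i)
                result.push_back(i);
        }
        pos = end + 1;
    }
    return result;
}
```

With this sketch, expand_range_specs("0-3", 8) yields the indexes 0, 1, 2 and 3, which is the set of threads (or cores) named by a specification such as thread:0-3.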

The following example assumes a system with at least 4 cores, where each core has more than 1 processing unit (hardware threads). Running hello_world_distributed with 4 OS threads (on 4 processing units), where each of those threads is bound to the first processing unit of each of the cores, can be achieved by invoking:

$ hello_world_distributed -t4 --hpx:bind=thread:0-3=core:0-3.pu:0

Here, thread:0-3 specifies the OS threads used to define affinity bindings, and core:0-3.pu:0 defines that for each of the cores (core:0-3) only their first processing unit (pu:0) should be used.

Note

The command line option --hpx:print-bind can be used to print the bitmasks generated from the affinity mappings as specified with --hpx:bind. For instance, on a system with hyperthreading enabled (i.e. 2 processing units per core), the command line:

$ hello_world_distributed -t4 --hpx:bind=thread:0-3=core:0-3.pu:0 --hpx:print-bind

will cause this output to be printed:

0: PU L#0(P#0), Core L#0, Socket L#0, Node L#0(P#0)
1: PU L#2(P#2), Core L#1, Socket L#0, Node L#0(P#0)
2: PU L#4(P#4), Core L#2, Socket L#0, Node L#0(P#0)
3: PU L#6(P#6), Core L#3, Socket L#0, Node L#0(P#0)

where each line shows the processing unit, together with its enclosing core, socket and NUMA node, that the listed worker thread will be bound to run on.
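The logical PU numbers in the output above (L#0, L#2, L#4, L#6) follow from simple arithmetic if one assumes that every core has the same number of processing units and that logical PU indexes are assigned core by core. Both are assumptions of this sketch, not guarantees made by hwloc or HPX, and the helper name logical_pu is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Logical index of processing unit `pu` on core `core`, assuming a uniform
// machine where PUs are numbered consecutively core by core.
std::size_t logical_pu(
    std::size_t core, std::size_t pu, std::size_t pus_per_core)
{
    assert(pu < pus_per_core);
    return core * pus_per_core + pu;
}
```

For the mapping thread:0-3=core:0-3.pu:0 with 2 processing units per core, worker thread i is paired with core i, so it lands on logical_pu(i, 0, 2), that is, on PUs 0, 2, 4 and 6.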

The difference between the four predefined distribution schemes (compact, scatter, balanced and numa-balanced) is best explained with an example. Imagine that we have a system with 4 cores and 4 hardware threads per core on 2 sockets. If we place 8 threads, the assignments produced by the compact, scatter, balanced and numa-balanced schemes are shown in the figure below. Notice that compact does not fully utilize all the cores in the system. For this reason, it is recommended that applications are run using the scatter or balanced/numa-balanced options in most cases.

_images/affinities.png

Fig. 7 Schematic of thread affinity type distributions.#
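The observation that compact leaves cores idle can be reproduced with a simplified model. The sketch below is not HPX's placement code: it ignores sockets and NUMA domains, covers only compact and scatter, and assumes that compact fills the processing units of one core before moving to the next while scatter hands out cores round-robin. The machine is read here as 2 sockets with 4 cores each (8 cores in total), which is an interpretation of the example above; all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Where a worker thread ends up: a core and a processing unit on that core.
struct placement
{
    std::size_t core;
    std::size_t pu;
};

enum class scheme
{
    compact,
    scatter
};

// Simplified placement rule for one worker thread.
placement place(scheme s, std::size_t thread, std::size_t num_cores,
    std::size_t pus_per_core)
{
    assert(num_cores > 0 && pus_per_core > 0);
    assert(thread < num_cores * pus_per_core);
    if (s == scheme::compact)
    {
        // Fill all PUs of a core before using the next core.
        return {thread / pus_per_core, thread % pus_per_core};
    }
    // Scatter: give each core one thread before doubling up.
    return {thread % num_cores, thread / num_cores};
}

// Number of distinct cores that receive at least one thread.
std::size_t cores_used(scheme s, std::size_t num_threads,
    std::size_t num_cores, std::size_t pus_per_core)
{
    std::set<std::size_t> cores;
    for (std::size_t t = 0; t != num_threads; ++t)
        cores.insert(place(s, t, num_cores, pus_per_core).core);
    return cores.size();
}
```

Under these assumptions, 8 threads on 8 cores with 4 processing units each occupy only 2 cores with compact, but all 8 cores with scatter.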

In addition to the predefined distributions, it is possible to restrict the resources used by HPX to the process CPU mask. The CPU mask is typically set by, for example, MPI and batch environments. Using the command line option --hpx:use-process-mask makes HPX act as if only the processing units in the CPU mask are available for use by HPX. The number of threads is automatically determined from the CPU mask. It can still be changed manually with --hpx:threads, but only to a number less than or equal to the number of processing units in the CPU mask. The option --hpx:print-bind is useful in conjunction with --hpx:use-process-mask to make sure threads are placed as expected.

1

The phase of a HPX-thread counts how often this thread has been activated.

Writing single-node applications#

Being a C++ Standard Library for Concurrency and Parallelism, HPX implements all of the corresponding facilities as defined by the C++ Standard but also those which are proposed as part of the ongoing C++ standardization process. This section focuses on the features available in HPX for parallel and concurrent computation on a single node, although many of the features presented here are also implemented to work in the distributed case.

Synchronization objects#

The following objects provide synchronization for HPX applications:

  1. Barrier

  2. Condition variable

  3. Latch

  4. Mutex

  5. Shared mutex

  6. Semaphore

  7. Composable guards

Barrier#

Barriers are used for synchronizing multiple threads. They provide a synchronization point, where all threads must wait until they have all reached the barrier, before they can continue execution. This allows multiple threads to work together to solve a common task, and ensures that no thread starts working on the next task until all threads have completed the current task. This ensures that all threads are in the same state before performing any further operations, leading to a more consistent and accurate computation.

Unlike latches, barriers are reusable: once the participating threads are released from a barrier’s synchronization point, they can re-use the same barrier. It is thus useful for managing repeated tasks, or phases of a larger task, that are handled by multiple threads. The code below shows how barriers can be used to synchronize two threads:

#include <hpx/barrier.hpp>
#include <hpx/future.hpp>
#include <hpx/init.hpp>

#include <iostream>

int hpx_main()
{
    hpx::barrier b(2);

    hpx::future<void> f1 = hpx::async([&b]() {
        std::cout << "Thread 1 started." << std::endl;
        // Do some computation
        b.arrive_and_wait();
        // Continue with next task
        std::cout << "Thread 1 finished." << std::endl;
    });

    hpx::future<void> f2 = hpx::async([&b]() {
        std::cout << "Thread 2 started." << std::endl;
        // Do some computation
        b.arrive_and_wait();
        // Continue with next task
        std::cout << "Thread 2 finished." << std::endl;
    });

    f1.get();
    f2.get();

    return hpx::local::finalize();
}

int main(int argc, char* argv[])
{
    return hpx::local::init(hpx_main, argc, argv);
}

In this example, two tasks are launched with hpx::async, each running on its own HPX thread and represented by an hpx::future object. Each thread calls the arrive_and_wait function of the hpx::barrier object and waits there until both threads have reached the barrier. Once both threads have arrived, they can continue with their next task.
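What makes a barrier reusable is that it resets itself whenever the last participant arrives. The following minimal sketch in standard C++ illustrates that idea with a phase counter. It is not HPX's implementation; the class simple_barrier and its completed_phases member are invented for this illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// A reusable barrier for a fixed number of participants (illustration only).
class simple_barrier
{
public:
    explicit simple_barrier(std::size_t expected)
      : expected_(expected)
      , waiting_(0)
      , phase_(0)
    {
        assert(expected > 0);
    }

    void arrive_and_wait()
    {
        std::unique_lock<std::mutex> lk(m_);
        std::size_t const my_phase = phase_;
        if (++waiting_ == expected_)
        {
            // Last arrival: reset the count, start the next phase and
            // release everybody who waits on the current one.
            waiting_ = 0;
            ++phase_;
            cv_.notify_all();
        }
        else
        {
            // Wait until the phase this thread arrived in has completed.
            cv_.wait(lk, [&] { return phase_ != my_phase; });
        }
    }

    // Number of times all participants have met at the barrier.
    std::size_t completed_phases() const
    {
        std::lock_guard<std::mutex> lk(m_);
        return phase_;
    }

private:
    mutable std::mutex m_;
    std::condition_variable cv_;
    std::size_t expected_;
    std::size_t waiting_;
    std::size_t phase_;
};
```

Because the arrival count is reset before the waiting threads are released, the same object can be used for the next phase immediately, which is the property that distinguishes a barrier from a latch.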

Condition variable#

A condition variable is a synchronization primitive in HPX that allows a thread to wait for a specific condition to be satisfied before continuing execution. It is typically used in conjunction with a mutex or a lock to protect shared data that is being modified by multiple threads. Hence, it blocks one or more threads until another thread both modifies a shared variable (the condition) and notifies the condition_variable. The code below shows how two threads modifying the shared variable data can be synchronized using the condition_variable:

#include <hpx/condition_variable.hpp>
#include <hpx/init.hpp>
#include <hpx/mutex.hpp>
#include <hpx/thread.hpp>

#include <chrono>
#include <iostream>
#include <mutex>
#include <string>

hpx::condition_variable cv;
hpx::mutex m;
std::string data;
bool ready = false;
bool processed = false;

void worker_thread()
{
    // Wait until the main thread signals that data is ready
    std::unique_lock<hpx::mutex> lk(m);
    cv.wait(lk, [] { return ready; });

    // Access the shared resource
    std::cout << "Worker thread: Processing data...\n";
    data = "Test data after";

    // Send data back to the main thread
    processed = true;
    std::cout << "Worker thread: data processing is complete\n";

    // Manual unlocking is done before notifying, to avoid waking up
    // the waiting thread only to block again
    lk.unlock();
    cv.notify_one();
}

int hpx_main()
{
    hpx::thread worker(worker_thread);

    // Do some work
    std::cout << "Main thread: Preparing data...\n";
    data = "Test data before";
    hpx::this_thread::sleep_for(std::chrono::seconds(1));
    std::cout << "Main thread: Data before processing = " << data << '\n';

    // Signal that data is ready and send data to worker thread
    {
        std::lock_guard<hpx::mutex> lk(m);
        ready = true;
        std::cout << "Main thread: Data is ready...\n";
    }
    cv.notify_one();

    // Wait for the worker thread to finish
    {
        std::unique_lock<hpx::mutex> lk(m);
        cv.wait(lk, [] { return processed; });
    }
    std::cout << "Main thread: Data after processing = " << data << '\n';
    worker.join();

    return hpx::local::finalize();
}

int main(int argc, char* argv[])
{
    return hpx::local::init(hpx_main, argc, argv);
}

The main thread of the code above starts by creating a worker thread and preparing the shared variable data. Once the data is ready, the main thread acquires a lock on the mutex m using std::lock_guard<hpx::mutex> lk(m) and sets the ready flag to true, then signals the worker thread to start processing by calling cv.notify_one(). The cv.wait() call in the main thread then blocks until the worker thread signals that processing is complete by setting the processed flag.

The worker thread starts by acquiring a lock on the mutex m to ensure exclusive access to the shared data. The cv.wait() call blocks the thread until the ready flag is set by the main thread. Once this is true, the worker thread accesses the shared data resource, processes it, and sets the processed flag to indicate completion. The mutex is then unlocked using lk.unlock() and the cv.notify_one() call signals the main thread to resume execution. Finally, the new data is printed by the main thread to the console.

Latch#

A latch is a downward counter which can be used to synchronize threads. The value of the counter is initialized on creation. Threads may block on the latch until the counter is decremented to zero. There is no possibility to increase or reset the counter, which makes the latch a single-use barrier.

In HPX, a latch is implemented as a counting semaphore, which can be initialized with a specific count value and decremented each time a thread reaches the latch. When the count value reaches zero, all waiting threads are unblocked and allowed to continue execution. The code below shows how latch can be used to synchronize 16 threads:

std::ptrdiff_t num_threads = 16;

///////////////////////////////////////////////////////////////////////////////
void wait_for_latch(hpx::latch& l)
{
    l.arrive_and_wait();
}

///////////////////////////////////////////////////////////////////////////////
int hpx_main(hpx::program_options::variables_map& vm)
{
    num_threads = vm["num-threads"].as<std::ptrdiff_t>();

    hpx::latch l(num_threads + 1);

    std::vector<hpx::future<void>> results;
    for (std::ptrdiff_t i = 0; i != num_threads; ++i)
        results.push_back(hpx::async(&wait_for_latch, std::ref(l)));

    // Wait for all threads to reach this point.
    l.arrive_and_wait();

    hpx::wait_all(results);

    return hpx::local::finalize();
}

In the above code, the hpx_main function creates a latch object l with a count of num_threads + 1 and then launches num_threads tasks using hpx::async. Each task runs the wait_for_latch function, which receives a reference to the latch and calls its arrive_and_wait method; this decrements the count of the latch and causes the thread to wait until the count reaches zero. The extra count is for the main thread, which also calls arrive_and_wait and is therefore released together with the other threads once all num_threads + 1 participants have arrived. Finally, the main thread waits for all the tasks to finish by calling hpx::wait_all.
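The "downward counter" behaviour can be illustrated with a few lines of standard C++. The class simple_latch below is a hypothetical stand-in written for this explanation, not HPX's latch; it only mirrors the member names count_down, try_wait, wait and arrive_and_wait.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// A single-use countdown latch (illustration only).
class simple_latch
{
public:
    explicit simple_latch(std::ptrdiff_t count)
      : count_(count)
    {
        assert(count >= 0);
    }

    // Decrement the counter; wake all waiters when it reaches zero.
    void count_down()
    {
        std::lock_guard<std::mutex> lk(m_);
        assert(count_ > 0);
        if (--count_ == 0)
            cv_.notify_all();
    }

    // Non-blocking check whether the counter has reached zero.
    bool try_wait() const
    {
        std::lock_guard<std::mutex> lk(m_);
        return count_ == 0;
    }

    // Block until the counter has reached zero.
    void wait()
    {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return count_ == 0; });
    }

    void arrive_and_wait()
    {
        count_down();
        wait();
    }

private:
    mutable std::mutex m_;
    std::condition_variable cv_;
    std::ptrdiff_t count_;
};
```

There is deliberately no member that increases or resets the counter: once it has reached zero, every later wait returns immediately.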

Mutex#

A mutex (short for “mutual exclusion”) is a synchronization primitive in HPX used to control access to a shared resource, ensuring that only one thread can access it at a time. A mutex is used to protect data structures from race conditions and other synchronization-related issues. When a thread acquires a mutex, other threads that try to access the same resource will be blocked until the mutex is released. The code below shows the basic use of mutexes:

#include <hpx/future.hpp>
#include <hpx/init.hpp>
#include <hpx/mutex.hpp>

#include <iostream>
#include <mutex>

int hpx_main()
{
    hpx::mutex m;

    hpx::future<void> f1 = hpx::async([&m]() {
        std::scoped_lock sl(m);
        std::cout << "Thread 1 acquired the mutex" << std::endl;
    });

    hpx::future<void> f2 = hpx::async([&m]() {
        std::scoped_lock sl(m);
        std::cout << "Thread 2 acquired the mutex" << std::endl;
    });

    hpx::wait_all(f1, f2);

    return hpx::local::finalize();
}

int main(int argc, char* argv[])
{
    return hpx::local::init(hpx_main, argc, argv);
}

In this example, two HPX threads created using hpx::async each acquire the hpx::mutex m. std::scoped_lock sl(m) is used to take ownership of the given mutex m. When control leaves the scope in which the scoped_lock object was created, the scoped_lock is destructed and the mutex is released.

Attention

A common way to acquire and release mutexes is by using the function m.lock() before accessing the shared resource, and m.unlock() called after the access is complete. However, these functions may lead to deadlocks in case of exception(s). That is, if an exception happens when the mutex is locked then the code that unlocks the mutex will never be executed, the lock will remain held by the thread that acquired it, and other threads will be unable to access the shared resource. This can cause a deadlock if the other threads are also waiting to acquire the same lock. For this reason, we suggest you use std::scoped_lock, which prevents this issue by releasing the lock when control leaves the scope in which the scoped_lock object was created.
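The effect described above can be checked with a small sketch. It uses std::mutex as a stand-in so that it builds without HPX; the function names are hypothetical. Since std::scoped_lock is also what the HPX example above uses with hpx::mutex, the same pattern is expected to carry over.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

std::mutex m;    // stand-in for hpx::mutex in this sketch

// Takes the lock and then fails; the scoped_lock destructor still runs
// during stack unwinding and releases the mutex.
void guarded_update(bool fail)
{
    std::scoped_lock<std::mutex> sl(m);
    if (fail)
        throw std::runtime_error("update failed");
}

// Returns true if the mutex is free again after guarded_update threw.
bool mutex_released_after_exception()
{
    try
    {
        guarded_update(true);
    }
    catch (std::runtime_error const&)
    {
        // The exception has left the locked scope at this point.
    }

    bool const is_free = m.try_lock();
    if (is_free)
        m.unlock();
    return is_free;
}
```

Had guarded_update called m.lock() and m.unlock() by hand, the exception would have skipped the unlock and the later try_lock would have failed.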

Shared mutex#

A shared mutex is a synchronization primitive that can be used to protect shared data from being simultaneously accessed by multiple threads. In contrast to other mutex types which facilitate exclusive access, a shared_mutex has two levels of access:

  • Exclusive access prevents any other thread from acquiring the mutex, just as with the normal mutex. It does not matter if the other thread tries to acquire shared or exclusive access.

  • Shared access allows multiple threads to acquire the mutex, but all of them only in shared mode. Exclusive access is not granted until all of the previous shared holders have returned the mutex (typically, as long as an exclusive request is waiting, new shared ones are queued to be granted after the exclusive access).

Shared mutexes are especially useful when shared data can be safely read by any number of threads simultaneously, but a thread may only write the same data when no other thread is reading or writing at the same time. A typical scenario is a database: The data can be read simultaneously by different threads with no problem. However, modification of the database is critical: if some threads read data while another one is writing, the threads reading may receive inconsistent data. Hence, while a thread is writing, reading should not be allowed. After writing is complete, reads can occur simultaneously again. The code below shows how shared_mutex can be used to synchronize reads and writes:

int const writers = 3;
int const readers = 3;
int const cycles = 10;

using std::chrono::milliseconds;

int hpx_main()
{
    std::vector<hpx::thread> threads;
    std::atomic<bool> ready(false);
    hpx::shared_mutex stm;

    for (int i = 0; i < writers; ++i)
    {
        threads.emplace_back([&ready, &stm, i] {
            std::mt19937 urng(static_cast<std::uint32_t>(std::time(nullptr)));
            std::uniform_int_distribution<int> dist(1, 1000);

            while (!ready)
            { /*** wait... ***/
            }

            for (int j = 0; j < cycles; ++j)
            {
                // scope of unique_lock
                {
                    std::unique_lock<hpx::shared_mutex> ul(stm);

                    std::cout << "^^^ Writer " << i << " starting..."
                              << std::endl;
                    hpx::this_thread::sleep_for(milliseconds(dist(urng)));
                    std::cout << "vvv Writer " << i << " finished."
                              << std::endl;
                }

                hpx::this_thread::sleep_for(milliseconds(dist(urng)));
            }
        });
    }

    for (int i = 0; i < readers; ++i)
    {
        int k = writers + i;
        threads.emplace_back([&ready, &stm, k, i] {
            HPX_UNUSED(k);
            std::mt19937 urng(static_cast<std::uint32_t>(std::time(nullptr)));
            std::uniform_int_distribution<int> dist(1, 1000);

            while (!ready)
            { /*** wait... ***/
            }

            for (int j = 0; j < cycles; ++j)
            {
                // scope of shared_lock
                {
                    std::shared_lock<hpx::shared_mutex> sl(stm);

                    std::cout << "Reader " << i << " starting..." << std::endl;
                    hpx::this_thread::sleep_for(milliseconds(dist(urng)));
                    std::cout << "Reader " << i << " finished." << std::endl;
                }
                hpx::this_thread::sleep_for(milliseconds(dist(urng)));
            }
        });
    }

    ready = true;
    for (auto& t : threads)
        t.join();

    return hpx::local::finalize();
}

The above code creates writers and readers threads, each of which will perform cycles of operations. Both the writer and reader threads use the hpx::shared_mutex object stm to synchronize access to a shared resource.

  • For the writer threads, a unique_lock on the shared mutex is acquired before each write operation and is released after control leaves the scope in which the unique_lock object was created.

  • For the reader threads, a shared_lock on the shared mutex is acquired before each read operation and is released after control leaves the scope in which the shared_lock object was created.

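While holding the lock, both the reader and writer threads sleep for a random time period, which simulates the processing time of the operation. After releasing the lock, each thread sleeps for another random period before starting its next cycle. The random time periods are generated using a random number generator.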

Semaphore#

Semaphores are a synchronization mechanism used to control concurrent access to a shared resource. The two types of semaphores are:

  • counting semaphore: it has a non-negative counter that is initialized in the constructor. Acquiring the semaphore decreases the counter and releasing the semaphore increases the counter. If a thread tries to acquire the semaphore when the counter is zero, the thread will block until another thread increments the counter by releasing the semaphore. Unlike hpx::mutex, an hpx::counting_semaphore is not bound to a thread, which means that the acquire and release call of a semaphore can happen on different threads.

  • binary semaphore: it is an alias for hpx::counting_semaphore<1>, that is, a counting semaphore whose least maximal value is 1. hpx::binary_semaphore can be used to implement locks.

#include <hpx/init.hpp>
#include <hpx/semaphore.hpp>
#include <hpx/thread.hpp>

#include <chrono>
#include <iostream>

// initialize the semaphore with a count of 3
hpx::counting_semaphore<> semaphore(3);

void worker()
{
    semaphore.acquire();    // decrement the semaphore's count
    std::cout << "Entering critical section" << std::endl;
    hpx::this_thread::sleep_for(std::chrono::seconds(1));
    semaphore.release();    // increment the semaphore's count
    std::cout << "Exiting critical section" << std::endl;
}

int hpx_main()
{
    hpx::thread t1(worker);
    hpx::thread t2(worker);
    hpx::thread t3(worker);
    hpx::thread t4(worker);
    hpx::thread t5(worker);

    t1.join();
    t2.join();
    t3.join();
    t4.join();
    t5.join();

    return hpx::local::finalize();
}

int main(int argc, char* argv[])
{
    return hpx::local::init(hpx_main, argc, argv);
}

In this example, the counting semaphore is initialized to the value of 3. This means that up to 3 threads can be inside the critical section (the code between acquire() and release() in the worker() function) at the same time. When a thread enters the critical section, it acquires the semaphore, which decrements the count; when it exits the critical section, it releases the semaphore, thus incrementing the count. The worker() function simulates a critical section by acquiring the semaphore, sleeping for 1 second and then releasing the semaphore.

In the hpx_main function, 5 worker threads are created and started, each trying to enter the critical section. If the count of the semaphore is already 0, a worker will wait until another worker releases the semaphore (increasing its value).
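The counter semantics can be summarised in a short sketch written in standard C++. The class simple_semaphore is a hypothetical illustration, not HPX's counting_semaphore; try_acquire is included so that the counter can be observed without blocking.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// A counting semaphore built from a mutex and a condition variable
// (illustration only).
class simple_semaphore
{
public:
    explicit simple_semaphore(std::ptrdiff_t initial)
      : count_(initial)
    {
        assert(initial >= 0);
    }

    // Block while the counter is zero, then decrement it.
    void acquire()
    {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return count_ > 0; });
        --count_;
    }

    // Decrement the counter if it is positive; never blocks.
    bool try_acquire()
    {
        std::lock_guard<std::mutex> lk(m_);
        if (count_ == 0)
            return false;
        --count_;
        return true;
    }

    // Increment the counter and wake one waiting thread, if any.
    void release()
    {
        {
            std::lock_guard<std::mutex> lk(m_);
            ++count_;
        }
        cv_.notify_one();
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::ptrdiff_t count_;
};
```

With an initial value of 3, three acquisitions succeed and a fourth one has to wait until some holder calls release, which matches the behaviour of the five workers in the example above.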

Composable guards#

Composable guards operate in a manner similar to locks, but are applied only to asynchronous functions. The guard (or guards) is automatically locked at the beginning of a specified task and automatically unlocked at the end. Because guards are never added to an existing task’s execution context, the calling of guards is freely composable and can never deadlock.

To run a task under a single guard, simply declare the guard and call run_guarded() with the guard and a function (task):

hpx::lcos::local::guard gu;
run_guarded(gu,task);

If a single method needs to run with multiple guards, use a guard set:

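std::shared_ptr<hpx::lcos::local::guard> gu1(new hpx::lcos::local::guard());
std::shared_ptr<hpx::lcos::local::guard> gu2(new hpx::lcos::local::guard());
// Collect both guards in a guard set; add() takes the shared pointer itself
hpx::lcos::local::guard_set gs;
gs.add(gu1);
gs.add(gu2);
run_guarded(gs,task);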

Guards use two atomic operations (which are not called repeatedly) to manage what they do, so overhead should be extremely low.

Execution control#

The following objects provide control of the execution in HPX applications:

  1. Futures

  2. Channels

  3. Task blocks

  4. Task groups

  5. Threads

Futures#

Futures are a mechanism to represent the result of a potentially asynchronous operation. A future is a type that represents a value that will become available at some point in the future, and it can be used to write asynchronous and parallel code. Futures can be returned from functions that perform time-consuming operations, allowing the calling code to continue executing while the function performs its work. The value of the future is set when the operation completes and can be accessed later. Below is an example demonstrating different features of futures:

#include <hpx/assert.hpp>
#include <hpx/future.hpp>
#include <hpx/hpx_main.hpp>
#include <hpx/tuple.hpp>

#include <iostream>
#include <utility>

int main()
{
    // Asynchronous execution with futures
    hpx::future<void> f1 = hpx::async(hpx::launch::async, []() {});
    hpx::shared_future<int> f2 =
        hpx::async(hpx::launch::async, []() { return 42; });
    hpx::future<int> f3 =
        f2.then([](hpx::shared_future<int>&& f) { return f.get() * 3; });

    hpx::promise<double> p;
    auto f4 = p.get_future();
    HPX_ASSERT(!f4.is_ready());
    p.set_value(123.45);
    HPX_ASSERT(f4.is_ready());

    hpx::packaged_task<int()> t([]() { return 43; });
    hpx::future<int> f5 = t.get_future();
    HPX_ASSERT(!f5.is_ready());
    t();
    HPX_ASSERT(f5.is_ready());

    // Fire-and-forget
    hpx::post([]() {
        std::cout << "This will be printed later\n" << std::flush;
    });

    // Synchronous execution
    hpx::sync([]() {
        std::cout << "This will be printed immediately\n" << std::flush;
    });

    // Combinators
    hpx::future<double> f6 = hpx::async([]() { return 3.14; });
    hpx::future<double> f7 = hpx::async([]() { return 42.0; });
    std::cout
        << hpx::when_all(f6, f7)
               .then([](hpx::future<
                         hpx::tuple<hpx::future<double>, hpx::future<double>>>
                             f) {
                   hpx::tuple<hpx::future<double>, hpx::future<double>> t =
                       f.get();
                   double pi = hpx::get<0>(t).get();
                   double r = hpx::get<1>(t).get();
                   return pi * r * r;
               })
               .get()
        << std::endl;

    // Easier continuations with dataflow; it waits for all future or
    // shared_future arguments before executing the continuation, and also
    // accepts non-future arguments
    hpx::future<double> f8 = hpx::async([]() { return 3.14; });
    hpx::future<double> f9 = hpx::make_ready_future(42.0);
    hpx::shared_future<double> f10 = hpx::async([]() { return 123.45; });
    hpx::future<hpx::tuple<double, double>> f11 = hpx::dataflow(
        [](hpx::future<double> a, hpx::future<double> b,
            hpx::shared_future<double> c, double d) {
            return hpx::make_tuple<>(a.get() + b.get(), c.get() / d);
        },
        f8, f9, f10, -3.9);

    // split_future gives a tuple of futures from a future of tuple
    hpx::tuple<hpx::future<double>, hpx::future<double>> f12 =
        hpx::split_future(std::move(f11));
    std::cout << hpx::get<1>(f12).get() << std::endl;

    return 0;
}

The first section of the main function demonstrates how to use futures for asynchronous execution. The first two statements create an hpx::future<void> and an hpx::shared_future<int> using the hpx::async() function; the hpx::launch::async launch policy requests that the work is executed asynchronously in separate threads. The third future is created by attaching a continuation to the second future using the then() member function. This future multiplies the result of the second future by 3.

The next part of the code demonstrates how to use promises and packaged tasks, which are constructs used for communicating data between threads. The promise class is used to store a value that can be retrieved later using a future; the code asserts that the future is not ready before set_value() is called and ready afterwards. The packaged_task class represents a task that can be executed asynchronously, and its result can be obtained using a future. The remaining lines of this part create a packaged task that returns an integer, obtain its future, and assert that the future is not ready before the task is executed and ready afterwards.

The code then demonstrates how to use the hpx::post() and hpx::sync() functions for fire-and-forget and synchronous execution, respectively. The hpx::post() function executes a given function asynchronously and returns immediately without waiting for the result. The hpx::sync() function executes a given function synchronously and waits for the result before returning.

Next, the code demonstrates the use of combinators, which are higher-order functions that combine two or more futures into a single future. The hpx::when_all() function is used to combine two futures, which return double values, into a single future holding a tuple of those futures. The then() member function is then used to compute the area of a circle using the values of the two futures. The get() member function is used to retrieve the result of the computation.

The last section demonstrates the use of hpx::dataflow(), which is a higher-order function that waits for all the future or shared_future arguments to be ready before executing the continuation. The hpx::make_ready_future() function is used to create a future with a given value. The hpx::split_future() function is used to split a future of a tuple into a tuple of futures. The last line retrieves the value of the second future in the tuple using hpx::get() and prints it to the console.

Extended facilities for futures#

Concurrency is about both decomposing and composing the program from the parts that work well individually and together. It is in the composition of connected and multicore components where today’s C++ libraries are still lacking.

The functionality of std::future offers a partial solution. It allows for the separation of the initiation of an operation and the act of waiting for its result; however, the act of waiting is synchronous. In communication-intensive code this act of waiting can be unpredictable, inefficient and simply frustrating. The example below illustrates a possible synchronous wait using futures:

#include <future>
using namespace std;
int main()
{
    future<int> f = async([]() { return 123; });
    int result = f.get(); // might block
}

For this reason, HPX implements a set of extensions to std::future (as proposed by N4313). This proposal introduces the following key asynchronous operations to hpx::future, hpx::shared_future and hpx::async, which enhance and enrich these facilities.

Table 10 Facilities extending std::future#

  • hpx::future::then: In asynchronous programming, it is very common for one asynchronous operation, on completion, to invoke a second operation and pass data to it. The current C++ standard does not allow one to register a continuation to a future. With then, instead of waiting for the result, a continuation is “attached” to the asynchronous operation, which is invoked when the result is ready. Continuations registered using then function will help to avoid blocking waits or wasting threads on polling, greatly improving the responsiveness and scalability of an application.

  • unwrapping constructor for hpx::future: In some scenarios, you might want to create a future that returns another future, resulting in nested futures. Although it is possible to write code to unwrap the outer future and retrieve the nested future and its result, such code is not easy to write because users must handle exceptions and it may cause a blocking call. Unwrapping can allow users to mitigate this problem by doing an asynchronous call to unwrap the outermost future.

  • hpx::future::is_ready: There are often situations where a get() call on a future may not be a blocking call, or is only a blocking call under certain circumstances. This function gives the ability to test for early completion and allows us to avoid associating a continuation, which needs to be scheduled with some non-trivial overhead and near-certain loss of cache efficiency.

  • hpx::make_ready_future: Some functions may know the value at the point of construction. In these cases the value is immediately available, but needs to be returned as a future. By using hpx::make_ready_future a future can be created that holds a pre-computed result in its shared state. In the current standard it is non-trivial to create a future directly from a value. First a promise must be created, then the promise is set, and lastly the future is retrieved from the promise. This can now be done with one operation.
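The three steps that the standard library requires for creating a ready future, as described for hpx::make_ready_future above, look as follows in plain standard C++. The helper names make_ready and is_ready are hypothetical and stand in for the HPX facilities of the same purpose; is_ready emulates the readiness test with a zero-length wait.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <utility>

// Build an already-ready std::future by hand.
template <typename T>
std::future<T> make_ready(T value)
{
    std::promise<T> p;                // 1. create the promise
    p.set_value(std::move(value));    // 2. set its value
    return p.get_future();            // 3. retrieve the future
}

// Non-blocking readiness test for a std::future.
template <typename T>
bool is_ready(std::future<T> const& f)
{
    assert(f.valid());
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

A call such as make_ready(42) returns a future on which get() does not block, which is the situation hpx::make_ready_future(42) covers in a single operation.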

The standard also omits the ability to compose multiple futures. This is a common pattern that is ubiquitous in other asynchronous frameworks and is absolutely necessary in order to make C++ a powerful asynchronous programming language. Not including these functions is akin to Boolean algebra without AND/OR.

In addition to the extensions proposed by N4313, HPX adds functions allowing users to compose several futures in a more flexible way.

Table 11 Facilities for composing hpx::futures#

  • hpx::when_any, hpx::when_any_n: Asynchronously wait for at least one of multiple future or shared_future objects to finish.

  • hpx::wait_any, hpx::wait_any_n: Synchronously wait for at least one of multiple future or shared_future objects to finish.

  • hpx::when_all, hpx::when_all_n: Asynchronously wait for all future and shared_future objects to finish.

  • hpx::wait_all, hpx::wait_all_n: Synchronously wait for all future and shared_future objects to finish.

  • hpx::when_some, hpx::when_some_n: Asynchronously wait for at least a given number of multiple future and shared_future objects to finish.

  • hpx::wait_some, hpx::wait_some_n: Synchronously wait for at least a given number of multiple future and shared_future objects to finish.

  • hpx::when_each: Asynchronously wait for multiple future and shared_future objects to finish and call a function for each of the future objects as soon as it becomes ready.

  • hpx::wait_each, hpx::wait_each_n: Synchronously wait for multiple future and shared_future objects to finish and call a function for each of the future objects as soon as it becomes ready.

Channels#

Channels combine communication (the exchange of a value) with synchronization (guaranteeing that two calculations (tasks) are in a known state). A channel can transport any number of values of a given type from a sender to a receiver:

    hpx::lcos::local::channel<int> c;
    hpx::future<int> f = c.get();
    HPX_ASSERT(!f.is_ready());
    c.set(42);
    HPX_ASSERT(f.is_ready());
    std::cout << f.get() << std::endl;

Channels can be handed to another thread (or in case of channel components, to other localities), thus establishing a communication channel between two independent places in the program:

void do_something(hpx::lcos::local::receive_channel<int> c,
    hpx::lcos::local::send_channel<> done)
{
    // prints 43
    std::cout << c.get(hpx::launch::sync) << std::endl;
    // signal back
    done.set();
}

void send_receive_channel()
{
    hpx::lcos::local::channel<int> c;
    hpx::lcos::local::channel<> done;

    hpx::post(&do_something, c, done);

    // send some value
    c.set(43);
    // wait for thread to be done
    done.get().wait();
}

Note how hpx::lcos::local::channel::get without any arguments returns a future which is ready when a value has been set on the channel. The launch policy hpx::launch::sync can be used to make hpx::lcos::local::channel::get block until a value is set and return the value directly.

A channel component is created on one locality and can be sent to another locality using an action. This example also demonstrates how a channel can be used as a range of values:

// channel components need to be registered for each used type (not needed
// for hpx::lcos::local::channel)
HPX_REGISTER_CHANNEL(double)

void channel_sender(hpx::lcos::channel<double> c)
{
    for (double d : c)
        hpx::cout << d << std::endl;
}
HPX_PLAIN_ACTION(channel_sender)

void channel()
{
    // create the channel on this locality
    hpx::lcos::channel<double> c(hpx::find_here());

    // pass the channel to a (possibly remote invoked) action
    hpx::post(channel_sender_action(), hpx::find_here(), c);

    // send some values to the receiver
    std::vector<double> v = {1.2, 3.4, 5.0};
    for (double d : v)
        c.set(d);

    // explicitly close the communication channel (implicit at destruction)
    c.close();
}
Task blocks#

Task blocks in HPX provide a way to structure and organize the execution of tasks in a parallel program, making it easier to manage dependencies between tasks. A task block is a group of tasks that can be executed in parallel. Tasks in a task block can depend on other tasks in the same task block. The task block allows the runtime to optimize the execution of tasks by scheduling them in an order based on the dependencies between them.

The define_task_block, run, and wait functions implement N4755. They are based on the task_block concept that is part of the common subset of the Microsoft Parallel Patterns Library (PPL) and the Intel Threading Building Blocks (TBB) libraries.

These implementations adopt a simpler syntax than the one exposed by those libraries, one that is influenced by language-based concepts, such as spawn and sync from Cilk++ and async and finish from X10. They improve on existing practice in the following ways:

  • The exception handling model is simplified and more consistent with normal C++ exceptions.

  • Most violations of strict fork-join parallelism can be caught at compile time (with compiler assistance, in some cases).

  • The syntax allows scheduling approaches other than child stealing.

Consider an example of a parallel traversal of a tree, where a user-provided function compute is applied to each node of the tree, returning the sum of the results:

template <typename Func>
int traverse(node& n, Func && compute)
{
    int left = 0, right = 0;
    define_task_block(
        [&](task_block<>& tr) {
            if (n.left)
                tr.run([&] { left = traverse(*n.left, compute); });
            if (n.right)
                tr.run([&] { right = traverse(*n.right, compute); });
        });

    return compute(n) + left + right;
}

The example above demonstrates the use of two of the functions, hpx::experimental::define_task_block and the hpx::experimental::task_block::run member function of a hpx::experimental::task_block.

The define_task_block function delineates a region of program code that may contain invocations of threads spawned by the run member function of the task_block class. The run function spawns an HPX thread, a unit of work that is allowed to execute in parallel with respect to the caller. Any parallel tasks spawned by run within the task block are joined back to a single thread of execution at the end of define_task_block. run takes a user-provided function object f and starts it asynchronously; that is, it may return before the execution of f completes. The HPX scheduler may choose to run f immediately or delay running f until compute resources become available.
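
To make the result of the traversal concrete, the following sketch shows a sequential equivalent of the example above, written in plain C++ without HPX. The node type and the names traverse_sequential, make_example_tree, and sum_example_tree are hypothetical and introduced only for illustration; the original example does not define node. The task-block version computes the same sum, but is allowed to run the two recursive calls in parallel.

```cpp
#include <cassert>
#include <memory>

// Hypothetical node type (assumption: the original example leaves it undefined).
struct node
{
    int value = 0;
    std::unique_ptr<node> left;
    std::unique_ptr<node> right;
};

// Sequential reference: the two recursive calls run one after the other
// instead of being spawned with tr.run() inside a task block.
template <typename Func>
int traverse_sequential(node& n, Func&& compute)
{
    int left = 0, right = 0;
    if (n.left)
        left = traverse_sequential(*n.left, compute);
    if (n.right)
        right = traverse_sequential(*n.right, compute);

    return compute(n) + left + right;
}

// Builds a three-node tree: a root holding 1 with children holding 2 and 3.
inline node make_example_tree()
{
    node root;
    root.value = 1;
    root.left = std::make_unique<node>();
    root.left->value = 2;
    root.right = std::make_unique<node>();
    root.right->value = 3;
    return root;
}

// Usage: sums the values stored in the example tree.
inline int sum_example_tree()
{
    node tree = make_example_tree();
    assert(tree.left && tree.right);    // the example tree has both children
    return traverse_sequential(tree, [](node& n) { return n.value; });
}
```

Because left and right are written by different tasks and only read after the join, the parallel version needs no extra synchronization for these two variables.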

A task_block can be constructed only by define_task_block because it has no public constructors. Thus, run can be invoked directly or indirectly only from a user-provided function passed to define_task_block:

void g();

void f(task_block<>& tr)
{
    tr.run(g);          // OK, invoked from within task_block in h
}

void h()
{
    define_task_block(f);
}

int main()
{
    task_block<> tr;    // Error: no public constructor
    tr.run(g);          // No way to call run outside of a define_task_block
    return 0;
}
Extensions for task blocks#
Using execution policies with task blocks#

HPX implements some extensions for task_block beyond the actual standards proposal N4755. The main addition is that define_task_block can be invoked with an execution policy as its first argument, very similar to the parallel algorithms.

An execution policy is an object that expresses the requirements on the ordering of functions invoked as a consequence of the invocation of a task block. Enabling passing an execution policy to define_task_block gives the user control over the amount of parallelism employed by the created task_block. In the following example the use of an explicit par execution policy makes the user’s intent explicit:

template <typename Func>
int traverse(node *n, Func&& compute)
{
    int left = 0, right = 0;

    define_task_block(
        execution::par,                // execution::parallel_policy
        [&](task_block<>& tb) {
            if (n->left)
                tb.run([&] { left = traverse(n->left, compute); });
            if (n->right)
                tb.run([&] { right = traverse(n->right, compute); });
        });

    return compute(n) + left + right;
}

This also causes the hpx::experimental::task_block object to be a template in our implementation. The template argument is the type of the execution policy used to create the task block. The template argument defaults to hpx::execution::parallel_policy.

HPX still supports calling hpx::experimental::define_task_block without an explicit execution policy. In this case the task block will run using the hpx::execution::parallel_policy.

HPX also adds the ability to access the execution policy that was used to create a given task_block.

Using executors to run tasks#

Often, users want to be able to not only define an execution policy to use by default for all spawned tasks inside the task block, but also to customize the execution context for one of the tasks executed by task_block::run. Adding an optionally passed executor instance to that function enables this use case:

template <typename Func>
int traverse(node *n, Func&& compute)
{
    int left = 0, right = 0;

    define_task_block(
        execution::par,                // execution::parallel_policy
        [&](auto& tb) {
            if (n->left)
            {
                // use explicitly specified executor to run this task
                tb.run(my_executor(), [&] { left = traverse(n->left, compute); });
            }
            if (n->right)
            {
                // use the executor associated with the par execution policy
                tb.run([&] { right = traverse(n->right, compute); });
            }
        });

    return compute(n) + left + right;
}

HPX still supports calling hpx::experimental::task_block::run without an explicit executor object. In this case the task will be run using the executor associated with the execution policy that was used to call hpx::experimental::define_task_block.

Task groups#

A task group in HPX is a synchronization primitive that allows you to execute a group of tasks concurrently and wait for their completion before continuing. The tasks in an hpx::experimental::task_group can be added dynamically. This is the HPX implementation of tbb::task_group of the Intel Threading Building Blocks (TBB) library.

The example below shows that to use a task group, you simply create an hpx::experimental::task_group object and add tasks to it using the run() method. Once all the tasks have been added, you can call the wait() method to synchronize the tasks and wait for them to complete.

#include <hpx/experimental/task_group.hpp>
#include <hpx/init.hpp>

#include <iostream>

void task1()
{
    std::cout << "Task 1 executed." << std::endl;
}

void task2()
{
    std::cout << "Task 2 executed." << std::endl;
}

int hpx_main()
{
    hpx::experimental::task_group tg;

    tg.run(task1);
    tg.run(task2);

    tg.wait();

    std::cout << "All tasks finished!" << std::endl;

    return hpx::local::finalize();
}

int main(int argc, char* argv[])
{
    return hpx::local::init(hpx_main, argc, argv);
}

Note

Task groups and task blocks are both ways to group and synchronize parallel tasks. They differ in how the tasks are joined: a task group is an object to which tasks can be added dynamically and on which wait() is called explicitly, while a task block delimits a fork-join region in which all tasks spawned by run are joined at the end of define_task_block.

A task group allows multiple parallel tasks to be grouped together as a single unit. It provides a way to synchronize all the tasks in the group before continuing with the rest of the program.

A task block, on the other hand, can only be created by define_task_block. Tasks spawned inside it are joined before define_task_block returns, which makes the fork-join structure explicit in the code.

Threads#

A thread in HPX refers to a sequence of instructions that can be executed concurrently with other such sequences in multithreading environments, while sharing the same address space. These threads can communicate with each other through various means, such as futures or shared data structures.

The example below demonstrates how to launch multiple threads and synchronize them using a hpx::latch object. It also shows how to query the state of threads and wait for futures to complete.

#include <hpx/future.hpp>
#include <hpx/init.hpp>
#include <hpx/thread.hpp>

#include <functional>
#include <iostream>
#include <vector>

int const num_threads = 10;

///////////////////////////////////////////////////////////////////////////////
void wait_for_latch(hpx::latch& l)
{
    l.arrive_and_wait();
}

int hpx_main()
{
    // Spawn a couple of threads
    hpx::latch l(num_threads + 1);

    std::vector<hpx::future<void>> results;
    results.reserve(num_threads);

    for (int i = 0; i != num_threads; ++i)
        results.push_back(hpx::async(&wait_for_latch, std::ref(l)));

    // Allow spawned threads to reach latch
    hpx::this_thread::yield();

    // Enumerate all suspended threads
    hpx::threads::enumerate_threads(
        [](hpx::threads::thread_id_type id) -> bool {
            std::cout << "thread " << hpx::thread::id(id) << " is "
                      << hpx::threads::get_thread_state_name(
                             hpx::threads::get_thread_state(id))
                      << std::endl;
            return true;    // always continue enumeration
        },
        hpx::threads::thread_schedule_state::suspended);

    // Wait for all threads to reach this point.
    l.arrive_and_wait();

    hpx::wait_all(results);

    return hpx::local::finalize();
}

int main(int argc, char* argv[])
{
    return hpx::local::init(hpx_main, argc, argv);
}

In more detail, wait_for_latch() is a simple helper function that arrives at an hpx::latch object and waits until the latch is released. Recall that hpx::latch is a synchronization primitive that allows multiple threads to wait for a common event to occur.

In the hpx_main() function, an hpx::latch object is created with a count of num_threads + 1, because the num_threads spawned threads and the thread running hpx_main() all need to arrive at the latch before it is released. The loop that follows launches num_threads asynchronous operations, each of which calls the wait_for_latch function. The resulting futures are added to the results vector.

After the threads have been launched, hpx::this_thread::yield() is called to give them a chance to reach the latch before the program proceeds. Then, the hpx::threads::enumerate_threads function prints the state of each suspended thread, while the next call of l.arrive_and_wait() waits for all the threads to reach the latch. Finally, hpx::wait_all is called to wait for all the futures to complete.

Hint

An advantage of using hpx::thread over other threading libraries is that it is optimized for high-performance parallelism, with support for lightweight threads and task scheduling to minimize thread overhead and maximize parallelism. Additionally, hpx::thread integrates seamlessly with other features of HPX such as futures, promises, and task groups, making it a powerful tool for parallel programming.

Check out the examples of Shared mutex, Condition variable, and Semaphore to see how HPX threads are used in combination with other features.

High level parallel facilities#

In preparation for the upcoming C++ Standards, there are currently several proposals targeting different facilities supporting parallel programming. HPX implements (and extends) some of those proposals. This is consistent with our strategy of aligning the APIs exposed by HPX with current and future C++ Standards.

At this point, HPX implements several of the C++ Standardization working papers, most notably N4409 (Working Draft, Technical Specification for C++ Extensions for Parallelism), N4755 (Task Blocks), and N4406 (Parallel Algorithms Need Executors).

Using parallel algorithms#

A parallel algorithm is a function template exposed in the namespace hpx, for example hpx::for_each.

All parallel algorithms are very similar in semantics to their sequential counterparts (as defined in the namespace std) with an additional formal template parameter named ExecutionPolicy. The execution policy is generally passed as the first argument to any of the parallel algorithms and describes the manner in which the execution of these algorithms may be parallelized and the manner in which they apply user-provided function objects.

The applications of function objects in parallel algorithms invoked with an execution policy object of type hpx::execution::sequenced_policy or hpx::execution::sequenced_task_policy execute in sequential order. For hpx::execution::sequenced_policy the execution happens in the calling thread.

The applications of function objects in parallel algorithms invoked with an execution policy object of type hpx::execution::parallel_policy or hpx::execution::parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and are indeterminately sequenced within each thread.

Important

It is the caller’s responsibility to ensure correctness, such as making sure that the invocation does not introduce data races or deadlocks.

The example below demonstrates how to perform a sequential and parallel hpx::for_each loop on a vector of integers.

#include <hpx/algorithm.hpp>
#include <hpx/execution.hpp>
#include <hpx/init.hpp>

#include <iostream>
#include <vector>

int hpx_main()
{
    std::vector<int> v{1, 2, 3, 4, 5};

    auto print = [](const int& n) { std::cout << n << ' '; };

    std::cout << "Print sequential: ";
    hpx::for_each(v.begin(), v.end(), print);
    std::cout << '\n';

    std::cout << "Print parallel: ";
    hpx::for_each(hpx::execution::par, v.begin(), v.end(), print);
    std::cout << '\n';

    return hpx::local::finalize();
}

int main(int argc, char* argv[])
{
    return hpx::local::init(hpx_main, argc, argv);
}

The above code uses hpx::for_each to print the elements of the vector v{1, 2, 3, 4, 5}. At first, hpx::for_each() is called without an execution policy, which means that it applies the lambda function print to each element in the vector sequentially. Hence, the elements are printed in order.

Next, hpx::for_each() is called with the hpx::execution::par execution policy, which applies the lambda function print to each element in the vector in parallel. Therefore, the output order of the elements in the vector is not deterministic and may vary from run to run.

Parallel exceptions#

During the execution of a standard parallel algorithm, if temporary memory resources are required by any of the algorithms and no memory is available, the algorithm throws a std::bad_alloc exception.

During the execution of any of the parallel algorithms, if the application of a function object terminates with an uncaught exception, the behavior of the program is determined by the type of execution policy used to invoke the algorithm:

For example, the number of invocations of the user-provided function object in for_each is unspecified. When hpx::for_each is executed sequentially, only one exception will be contained in the hpx::exception_list object.

These guarantees imply that, unless the algorithm has failed to allocate memory and terminated with std::bad_alloc, all exceptions thrown during the execution of the algorithm are communicated to the caller. It is unspecified whether an algorithm implementation will “forge ahead” after encountering and capturing a user exception.

The algorithm may terminate with the std::bad_alloc exception even if one or more user-provided function objects have terminated with an exception. For example, this can happen when an algorithm fails to allocate memory while creating or adding elements to the hpx::exception_list object.

Parallel algorithms#

HPX provides implementations of the following parallel algorithms:

Table 12 Non-modifying parallel algorithms of header hpx/algorithm.hpp#

Name

Description

C++ standard

hpx::adjacent_find

Searches a range for the first two adjacent elements that are equal.

adjacent_find

hpx::all_of

Checks if a predicate is true for all of the elements in a range.

all_any_none_of

hpx::any_of

Checks if a predicate is true for any of the elements in a range.

all_any_none_of

hpx::count

Returns the number of elements equal to a given value.

count

hpx::count_if

Returns the number of elements satisfying a specific criteria.

count_if

hpx::equal

Determines if two sets of elements are the same.

equal

hpx::find

Finds the first element equal to a given value.

find

hpx::find_end

Finds the last sequence of elements in a certain range.

find_end

hpx::find_first_of

Searches for any one of a set of elements.

find_first_of

hpx::find_if

Finds the first element satisfying a specific criteria.

find_if

hpx::find_if_not

Finds the first element not satisfying a specific criteria.

find_if_not

hpx::for_each

Applies a function to a range of elements.

for_each

hpx::for_each_n

Applies a function to a number of elements.

for_each_n

hpx::lexicographical_compare

Checks if a range of values is lexicographically less than another range of values.

lexicographical_compare

hpx::mismatch

Finds the first position where two ranges differ.

mismatch

hpx::none_of

Checks if a predicate is true for none of the elements in a range.

all_any_none_of

hpx::search

Searches for a range of elements.

search

hpx::search_n

Searches for a number of consecutive copies of an element in a range.

search_n


Table 13 Modifying parallel algorithms of header hpx/algorithm.hpp#

Name

Description

C++ standard

hpx::copy

Copies a range of elements to a new location.

copy

hpx::copy_n

Copies a number of elements to a new location.

copy_n

hpx::copy_if

Copies the elements from a range to a new location for which the given predicate is true

copy

hpx::move

Moves a range of elements to a new location.

move

hpx::fill

Assigns a range of elements a certain value.

fill

hpx::fill_n

Assigns a value to a number of elements.

fill_n

hpx::generate

Saves the result of a function in a range.

generate

hpx::generate_n

Saves the result of N applications of a function.

generate_n

hpx::experimental::reduce_by_key

Performs an inclusive scan on consecutive elements with matching keys, with a reduction to output only the final sum for each key. The key sequence {1,1,1,2,3,3,3,3,1} and value sequence {2,3,4,5,6,7,8,9,10} would be reduced to keys={1,2,3,1}, values={9,5,30,10}.

hpx::remove

Removes the elements from a range that are equal to the given value.

remove

hpx::remove_if

Removes the elements from a range for which the given predicate is true.

remove

hpx::remove_copy

Copies the elements from a range to a new location that are not equal to the given value.

remove_copy

hpx::remove_copy_if

Copies the elements from a range to a new location for which the given predicate is false

remove_copy

hpx::replace

Replaces all values equal to a given value with another value.

replace

hpx::replace_if

Replaces all values satisfying specific criteria with another value.

replace

hpx::replace_copy

Copies a range, replacing elements equal to a given value with another value.

replace_copy

hpx::replace_copy_if

Copies a range, replacing elements satisfying specific criteria with another value.

replace_copy

hpx::reverse

Reverses the order of elements in a range.

reverse

hpx::reverse_copy

Creates a copy of a range that is reversed.

reverse_copy

hpx::rotate

Rotates the order of elements in a range.

rotate

hpx::rotate_copy

Copies and rotates a range of elements.

rotate_copy

hpx::shift_left

Shifts the elements in the range left by n positions.

shift_left

hpx::shift_right

Shifts the elements in the range right by n positions.

shift_right

hpx::swap_ranges

Swaps two ranges of elements.

swap_ranges

hpx::transform

Applies a function to a range of elements.

transform

hpx::unique

Eliminates all but the first element from every consecutive group of equivalent elements from a range.

unique

hpx::unique_copy

Copies the elements from one range to another in such a way that there are no consecutive equal elements.

unique_copy
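
The reduce_by_key entry in the table above is easier to follow with a small reference model. The sketch below is a sequential, plain C++ illustration of the described semantics and not the HPX implementation; the name reduce_by_key_reference is hypothetical, and addition is assumed as the reduction operation, as in the example given in the table.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sequential reference model: every run of consecutive equal keys is collapsed
// into a single key paired with the sum of the corresponding values.
template <typename Key, typename Value>
std::pair<std::vector<Key>, std::vector<Value>> reduce_by_key_reference(
    std::vector<Key> const& keys, std::vector<Value> const& values)
{
    assert(keys.size() == values.size());

    std::vector<Key> out_keys;
    std::vector<Value> out_values;
    for (std::size_t i = 0; i != keys.size(); ++i)
    {
        if (!out_keys.empty() && out_keys.back() == keys[i])
        {
            out_values.back() += values[i];    // same run: accumulate
        }
        else
        {
            out_keys.push_back(keys[i]);       // a new run starts here
            out_values.push_back(values[i]);
        }
    }
    return {std::move(out_keys), std::move(out_values)};
}
```

Note that the key 1 appears twice in the output of the table's example: only consecutive equal keys are merged, so the trailing 1 forms its own run.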


Table 14 Set operations on sorted sequences of header hpx/algorithm.hpp#

Name

Description

C++ standard

hpx::merge

Merges two sorted ranges.

merge

hpx::inplace_merge

Merges two ordered ranges in-place.

inplace_merge

hpx::includes

Returns true if one set is a subset of another.

includes

hpx::set_difference

Computes the difference between two sets.

set_difference

hpx::set_intersection

Computes the intersection of two sets.

set_intersection

hpx::set_symmetric_difference

Computes the symmetric difference between two sets.

set_symmetric_difference

hpx::set_union

Computes the union of two sets.

set_union


Table 15 Heap operations of header hpx/algorithm.hpp#

Name

Description

C++ standard

hpx::is_heap

Returns true if the range is max heap.

is_heap

hpx::is_heap_until

Returns the first element that breaks a max heap.

is_heap_until

hpx::make_heap

Constructs a max heap in the range [first, last).

make_heap


Table 16 Minimum/maximum operations of header hpx/algorithm.hpp#

Name

Description

C++ standard

hpx::max_element

Returns the largest element in a range.

max_element

hpx::min_element

Returns the smallest element in a range.

min_element

hpx::minmax_element

Returns the smallest and the largest element in a range.

minmax_element


Table 17 Partitioning Operations of header hpx/algorithm.hpp#

Name

Description

C++ standard

hpx::nth_element

Partially sorts the given range making sure that it is partitioned by the given element

nth_element

hpx::is_partitioned

Returns true if each true element for a predicate precedes the false elements in a range.

is_partitioned

hpx::partition

Divides elements into two groups without preserving their relative order.

partition

hpx::partition_copy

Copies a range dividing the elements into two groups.

partition_copy

hpx::stable_partition

Divides elements into two groups while preserving their relative order.

stable_partition


Table 18 Sorting Operations of header hpx/algorithm.hpp#

Name

Description

C++ standard

hpx::is_sorted

Returns true if the elements in a range are sorted.

is_sorted

hpx::is_sorted_until

Returns the first unsorted element.

is_sorted_until

hpx::sort

Sorts the elements in a range.

sort

hpx::stable_sort

Sorts the elements in a range, maintaining the relative order of equal elements.

stable_sort

hpx::partial_sort

Sorts the first elements in a range.

partial_sort

hpx::partial_sort_copy

Sorts the first elements in a range, storing the result in another range.

partial_sort_copy

hpx::experimental::sort_by_key

Sorts one range of data using keys supplied in another range.


Table 19 Numeric Parallel Algorithms of header hpx/numeric.hpp#

Name

Description

C++ standard

hpx::adjacent_difference

Calculates the difference between each element in an input range and the preceding element.

adjacent_difference

hpx::exclusive_scan

Does an exclusive parallel scan over a range of elements.

exclusive_scan

hpx::inclusive_scan

Does an inclusive parallel scan over a range of elements.

inclusive_scan

hpx::reduce

Sums up a range of elements.

reduce

hpx::transform_exclusive_scan

Does an exclusive parallel scan over a range of elements after applying a function.

transform_exclusive_scan

hpx::transform_inclusive_scan

Does an inclusive parallel scan over a range of elements after applying a function.

transform_inclusive_scan

hpx::transform_reduce

Sums up a range of elements after applying a function. Also, accumulates the inner products of two input ranges.

transform_reduce
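
The difference between the scan variants listed above can be illustrated with the sequential std:: counterparts from the C++17 header numeric, which this section describes as semantically similar to the hpx:: versions. This is a minimal sketch rather than HPX code; scan_results and scan_example are hypothetical names. The hpx:: algorithms are used in the same way, optionally with an execution policy as the first argument.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

struct scan_results
{
    std::vector<int> exclusive;
    std::vector<int> inclusive;
    int sum_of_squares;
};

inline scan_results scan_example(std::vector<int> const& in)
{
    scan_results r{
        std::vector<int>(in.size()), std::vector<int>(in.size()), 0};

    // exclusive scan: output i is the initial value plus the sum of in[0..i)
    std::exclusive_scan(in.begin(), in.end(), r.exclusive.begin(), 0);

    // inclusive scan: output i is the sum of in[0..i]
    std::inclusive_scan(in.begin(), in.end(), r.inclusive.begin());

    // the two scans differ exactly by the current input element
    for (std::size_t i = 0; i != in.size(); ++i)
        assert(r.inclusive[i] == r.exclusive[i] + in[i]);

    // transform_reduce: apply a function to each element, then sum the results
    r.sum_of_squares = std::transform_reduce(in.begin(), in.end(), 0,
        std::plus<>{}, [](int x) { return x * x; });

    return r;
}
```

For the input {1, 2, 3, 4}, the exclusive scan yields {0, 1, 3, 6}, the inclusive scan yields {1, 3, 6, 10}, and the sum of squares is 30.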


Table 20 Dynamic Memory Management of header hpx/memory.hpp#

Name

Description

C++ standard

hpx::destroy

Destroys a range of objects.

destroy

hpx::destroy_n

Destroys a number of objects in a range.

destroy_n

hpx::uninitialized_copy

Copies a range of objects to an uninitialized area of memory.

uninitialized_copy

hpx::uninitialized_copy_n

Copies a number of objects to an uninitialized area of memory.

uninitialized_copy_n

hpx::uninitialized_default_construct

Constructs objects by default-initialization in an uninitialized area of memory.

uninitialized_default_construct

hpx::uninitialized_default_construct_n

Constructs a number of objects by default-initialization in an uninitialized area of memory.

uninitialized_default_construct_n

hpx::uninitialized_fill

Copies an object to an uninitialized area of memory defined by a range.

uninitialized_fill

hpx::uninitialized_fill_n

Copies an object to an uninitialized area of memory defined by a start and a count.

uninitialized_fill_n

hpx::uninitialized_move

Moves a range of objects to an uninitialized area of memory.

uninitialized_move

hpx::uninitialized_move_n

Moves a number of objects to an uninitialized area of memory.

uninitialized_move_n

hpx::uninitialized_value_construct

Constructs objects by value-initialization in an uninitialized area of memory.

uninitialized_value_construct

hpx::uninitialized_value_construct_n

Constructs a number of objects by value-initialization in an uninitialized area of memory.

uninitialized_value_construct_n


Table 21 Index-based for-loops of header hpx/algorithm.hpp#

Name

Description

hpx::experimental::for_loop

Implements loop functionality over a range specified by integral or iterator bounds.

hpx::experimental::for_loop_strided

Implements loop functionality over a range specified by integral or iterator bounds, advancing by a given stride.

hpx::experimental::for_loop_n

Implements loop functionality over a range specified by an integral or iterator start and a number of iterations.

hpx::experimental::for_loop_n_strided

Implements loop functionality over a range specified by an integral or iterator start and a number of iterations, advancing by a given stride.

Executor parameters and executor parameter traits#

HPX introduces the notion of executor parameters and executor parameter traits. At this point, the only parameter that can be customized is the size of the chunks of work executed on a single HPX thread (such as the number of loop iterations combined to run as a single task).

An executor parameter object is responsible for exposing the calculation of the size of the chunks scheduled. It abstracts the (potentially platform-specific) algorithms of determining those chunk sizes.

The way executor parameters are implemented is aligned with the way executors are implemented. All functionalities of concrete executor parameter types are exposed and accessible through a corresponding customization point, e.g. get_chunk_size().

With executor_parameter_traits, clients access all types of executor parameters uniformly, e.g.:

std::size_t chunk_size =
    hpx::execution::experimental::get_chunk_size(my_parameter, my_executor,
        num_cores, num_tasks);

This call synchronously retrieves the size of a single chunk of loop iterations (or similar) to combine for execution on a single HPX thread, given the overall number of cores num_cores and the number of tasks to schedule num_tasks. The executor parameter type might dynamically determine the execution time of one or more tasks in order to calculate the chunk size; see hpx::execution::experimental::auto_chunk_size for an example of this executor parameter type.

Other functions in the interface exist to discover whether an executor parameter type should be invoked once (i.e., it returns a static chunk size; see hpx::execution::experimental::static_chunk_size) or whether it should be invoked for each scheduled chunk of work (i.e., it returns a variable chunk size; for an example, see hpx::execution::experimental::guided_chunk_size).

Although this interface appears to require executor parameter type authors to implement all different basic operations, none are required. In practice, all operations have sensible defaults. However, some executor parameter types will naturally specialize all operations for maximum efficiency.

HPX implements the following executor parameter types:

  • hpx::execution::experimental::auto_chunk_size: Loop iterations are divided into pieces and then assigned to threads. The number of loop iterations combined is determined based on measurements of how long the execution of 1% of the overall number of iterations takes. This executor parameter type makes sure that as many loop iterations are combined as necessary to run for the amount of time specified.

  • hpx::execution::experimental::static_chunk_size: Loop iterations are divided into pieces of a given size and then assigned to threads. If the size is not specified, the iterations are, if possible, evenly divided contiguously among the threads. This executor parameter type is equivalent to OpenMP’s STATIC scheduling directive.

  • hpx::execution::experimental::dynamic_chunk_size: Loop iterations are divided into pieces of a given size and then dynamically scheduled among the cores; when a core finishes one chunk, it is dynamically assigned another. If the size is not specified, the default chunk size is 1. This executor parameter type is equivalent to OpenMP’s DYNAMIC scheduling directive.

  • hpx::execution::experimental::guided_chunk_size: Iterations are dynamically assigned to cores in blocks as cores request them until no blocks remain to be assigned. This is similar to dynamic_chunk_size except that the block size decreases each time a number of loop iterations is given to a thread. The size of the initial block is proportional to number_of_iterations / number_of_cores. Subsequent blocks are proportional to number_of_iterations_remaining / number_of_cores. The optional chunk size parameter defines the minimum block size. The default minimal chunk size is 1. This executor parameter type is equivalent to OpenMP’s GUIDED scheduling directive.
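
The chunking rules described in the list above can be made concrete with a small model. The sketch below is plain C++ and does not call HPX; static_chunks and guided_chunks are hypothetical helper names. It assumes a proportionality factor of 1 for the guided case (each block is exactly remaining / cores, but never smaller than the minimum chunk size), which the text above does not specify, so the actual sizes chosen by HPX may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Static scheduling without an explicit size: the iterations are divided as
// evenly as possible into one contiguous chunk per core.
inline std::vector<std::size_t> static_chunks(
    std::size_t iterations, std::size_t cores)
{
    assert(cores > 0);
    std::vector<std::size_t> chunks;
    std::size_t const base = iterations / cores;
    std::size_t const extra = iterations % cores;    // leftover iterations
    for (std::size_t c = 0; c != cores; ++c)
    {
        // the first `extra` chunks receive one additional iteration
        std::size_t const size = base + (c < extra ? 1 : 0);
        if (size != 0)
            chunks.push_back(size);
    }
    return chunks;
}

// Guided scheduling: each block is remaining / cores (assumed factor of 1),
// bounded below by min_chunk and above by the iterations that are left.
inline std::vector<std::size_t> guided_chunks(
    std::size_t iterations, std::size_t cores, std::size_t min_chunk = 1)
{
    assert(cores > 0 && min_chunk > 0);
    std::vector<std::size_t> chunks;
    std::size_t remaining = iterations;
    while (remaining != 0)
    {
        std::size_t size = std::max(remaining / cores, min_chunk);
        size = std::min(size, remaining);
        chunks.push_back(size);
        remaining -= size;
    }
    return chunks;
}
```

Under these assumptions, 100 iterations on 4 cores with a minimum chunk size of 10 are split into blocks of 25, 18, 14, 10, 10, 10, 10, and 3 iterations. Large early blocks keep scheduling overhead low, while the smaller late blocks help balance the load at the end of the loop.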

Writing distributed applications#

This section focuses on the features of HPX needed to write distributed applications, namely the Active Global Address Space (AGAS), remotely executable functions (i.e., actions), and distributed objects (i.e., components).

Global names#

HPX implements an Active Global Address Space (AGAS) which exposes a single uniform address space spanning all localities an application runs on. AGAS is a fundamental component of the ParalleX execution model. Conceptually, there is no rigid demarcation of local or global memory in AGAS; all available memory is a part of the same address space. AGAS enables named objects to be moved (migrated) across localities without having to change the object’s name; i.e., no references to migrated objects have to be ever updated. This feature has significance for dynamic load balancing and in applications where the workflow is highly dynamic, allowing work to be migrated from heavily loaded nodes to less loaded nodes. In addition, immutability of names ensures that AGAS does not have to keep extra indirections (“bread crumbs”) when objects move, hence, minimizing complexity of code management for system developers as well as minimizing overheads in maintaining and managing aliases.

The AGAS implementation in HPX does not automatically expose every local address to the global address space. It is the responsibility of the programmer to explicitly define which of the objects have to be globally visible and which of the objects are purely local.
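
The idea of immutable global names with explicit registration can be modeled in plain C++. The sketch below is a toy, single-process model: toy_address_space, global_name, and locality_id are made-up names for illustration and say nothing about how AGAS is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Toy model only; none of these names are HPX APIs.
using global_name = std::uint64_t;    // stands in for a global address
using locality_id = std::uint32_t;

class toy_address_space
{
public:
    // Explicitly make an object globally visible; returns its global name.
    global_name register_object(locality_id where)
    {
        global_name name = next_name_++;
        location_[name] = where;
        return name;
    }

    // Move the object. Names already held by clients stay unchanged.
    void migrate(global_name name, locality_id destination)
    {
        assert(location_.count(name) == 1);
        location_[name] = destination;
    }

    // Resolve a name to the locality currently hosting the object.
    locality_id resolve(global_name name) const
    {
        return location_.at(name);
    }

private:
    global_name next_name_ = 1;
    std::unordered_map<global_name, locality_id> location_;
};
```

A copy of a name taken before migrate() resolves to the new locality afterwards, which is the property the text describes: only the mapping changes, never the name.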

In HPX, global addresses (global names) are represented using the hpx::id_type data type. This data type is conceptually very similar to a void* pointer, as it does not expose any type information about the object it refers to.

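The only predefined global addresses are assigned to all localities. The following HPX API functions, among others, allow one to retrieve the global addresses of localities:

  • hpx::find_here: Retrieves the global address of the locality this function is called on.

  • hpx::find_all_localities: Retrieves the global addresses of all localities available to the application.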

Additionally, the global addresses of localities can be used to create new instances of components using the following HPX API function:

  • hpx::components::new_: Creates a new instance of the given Component type on the specified locality.

Note

HPX does not expose any functionality to delete component instances. All global addresses (as represented using hpx::id_type) are automatically garbage collected. When the last (global) reference to a particular component instance goes out of scope, the corresponding component instance is automatically deleted.

Posting actions#
Action type definition#

Actions are special types used to describe possibly remote operations. For every global function and every member function which has to be invoked remotely, a special type must be defined. For any global function the special macro HPX_PLAIN_ACTION can be used to define the action type. Here is an example demonstrating this:

namespace app
{
    void some_global_function(double d)
    {
        std::cout << d;
    }
}

// This will define the action type 'some_global_action' which represents
// the function 'app::some_global_function'.
HPX_PLAIN_ACTION(app::some_global_function, some_global_action);

Important

The macro HPX_PLAIN_ACTION has to be placed in global namespace, even if the wrapped function is located in some other namespace. The newly defined action type is placed in the global namespace as well.

If the action type should be defined somewhere not in global namespace, the action type definition has to be split into two macro invocations (HPX_DEFINE_PLAIN_ACTION and HPX_REGISTER_ACTION) as shown in the next example:

namespace app
{
    void some_global_function(double d)
    {
        std::cout << d;
    }

    // On conforming compilers the following macro expands to:
    //
    //    typedef hpx::actions::make_action<
    //        decltype(&some_global_function), &some_global_function
    //    >::type some_global_action;
    //
    // This will define the action type 'some_global_action' which represents
    // the function 'some_global_function'.
    HPX_DEFINE_PLAIN_ACTION(some_global_function, some_global_action);
}

// The following macro expands to a series of definitions of global objects
// which are needed for proper serialization and initialization support
// enabling the remote invocation of the function 'some_global_function'
HPX_REGISTER_ACTION(app::some_global_action, app_some_global_action);

The shown code defines an action type some_global_action inside the namespace app.

Important

If the action type definition is split between two macros as shown above, the name of the action type to create has to be the same for both macro invocations (here some_global_action).

Important

The second argument passed to HPX_REGISTER_ACTION (app_some_global_action) has to be a globally unique C++ identifier representing the action. This is used for serialization purposes.
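
Why the identifier has to be unique becomes clearer with a toy model of name-based dispatch. The sketch below is not HPX code; toy_registry, toy_register_action, and toy_invoke_by_name are hypothetical names. It assumes that a receiver finds the function to run by looking up the name carried in a message, which is one common way to implement this kind of lookup.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Toy registry, not HPX code: maps the unique name that travels in a
// message to the function that should run on the receiving side.
using toy_action = std::function<double(double)>;

inline std::map<std::string, toy_action>& toy_registry()
{
    static std::map<std::string, toy_action> registry;
    return registry;
}

// Returns false if the name is already taken. Two actions sharing one
// name could not be told apart by the receiver.
inline bool toy_register_action(std::string const& name, toy_action f)
{
    assert(!name.empty());
    return toy_registry().emplace(name, std::move(f)).second;
}

// What a receiver would do with an incoming (name, argument) pair.
// Throws std::out_of_range for unknown names.
inline double toy_invoke_by_name(std::string const& name, double arg)
{
    return toy_registry().at(name)(arg);
}
```

In this model a second registration under the same name is rejected, which is the situation a globally unique identifier avoids.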

For member functions of objects which have been registered with AGAS (e.g., ‘components’), a different registration macro HPX_DEFINE_COMPONENT_ACTION has to be utilized. Any component needs to be declared in a header file and have some special support macros defined in a source file. Here is an example demonstrating this. The first snippet has to go into the header file:

namespace app
{
    struct some_component
      : hpx::components::component_base<some_component>
    {
        int some_member_function(std::string s)
        {
            return boost::lexical_cast<int>(s);
        }

        // This will define the action type 'some_member_action' which
        // represents the member function 'some_member_function' of the
        // object type 'some_component'.
        HPX_DEFINE_COMPONENT_ACTION(some_component, some_member_function,
            some_member_action);
    };
}

// Note: The second argument to the macro below has to be a system-wide
//       unique C++ identifier
HPX_REGISTER_ACTION_DECLARATION(app::some_component::some_member_action, some_component_some_action);

The next snippet belongs in a source file (e.g., the main application source file) in the simplest case:

typedef hpx::components::component<app::some_component> component_type;
typedef app::some_component some_component;

HPX_REGISTER_COMPONENT(component_type, some_component);

// The parameters for this macro have to be the same as used in the corresponding
// HPX_REGISTER_ACTION_DECLARATION() macro invocation above
typedef some_component::some_member_action some_component_some_action;
HPX_REGISTER_ACTION(some_component_some_action);

While these macro invocations are a bit more complex than those for simple global functions, they should still be manageable.

The most important macro invocation is the HPX_DEFINE_COMPONENT_ACTION in the header file as this defines the action type we need to invoke the member function. For a complete example of a simple component action see component_in_executable.cpp.

Action invocation#

The process of invoking a global function (or a member function of an object) with the help of the associated action is called ‘posting the action’. Actions can have arguments, which will be supplied while the action is applied. At the minimum, one parameter is required to post any action - the id of the locality the associated function should be invoked on (for global functions), or the id of the component instance (for member functions). Generally, HPX provides several ways to post an action, all of which are described in the following sections.

Generally, HPX actions are very similar to ‘normal’ C++ functions except that actions can be invoked remotely. Fig. 8 below shows an overview of the main API exposed by HPX. This shows the function invocation syntax as defined by the C++ language (dark gray), the additional invocation syntax as provided through C++ Standard Library features (medium gray), and the extensions added by HPX (light gray) where:

  • f: function to invoke,

  • p..: (optional) arguments,

  • R: return type of f,

  • action: action type, defined by HPX_DEFINE_PLAIN_ACTION or HPX_DEFINE_COMPONENT_ACTION, encapsulating f,

  • a: an instance of the type action,

  • id: the global address the action is applied to.

_images/hpx_the_api.png

Fig. 8 Overview of the main API exposed by HPX.#

This figure shows that HPX allows the user to post actions with a syntax similar to the C++ standard. In fact, all action types have an overloaded function call operator that allows posting the action synchronously. Further, HPX implements hpx::async, which semantically works similarly to the way std::async works for plain C++ functions.

Note

The similarity of posting an action to conventional function invocations extends even further. HPX implements hpx::bind and hpx::function, two facilities which are semantically equivalent to the std::bind and std::function types as defined by the C++11 Standard. While hpx::async extends beyond the conventional semantics by supporting actions and conventional C++ functions, the HPX facilities hpx::bind and hpx::function extend beyond the conventional standard facilities too. The HPX facilities not only support conventional functions, but can be used for actions as well.

Additionally, HPX exposes hpx::post and hpx::async_continue, both of which refine and extend the standard C++ facilities.

The different ways to invoke a function in HPX will be explained in more detail in the following sections.

Posting an action asynchronously without any synchronization#

This method (‘fire and forget’) will make sure the function associated with the action is scheduled to run on the target locality. Posting the action does not wait for the function to start running, instead it is a fully asynchronous operation. The following example shows how to post the action as defined in the previous section on the local locality (the locality this code runs on):

some_global_action act;     // define an instance of some_global_action
hpx::post(act, hpx::find_here(), 2.0);

(the function hpx::find_here() returns the id of the local locality, i.e. the locality this code executes on).

Any component member function can be invoked using the same syntactic construct. Given that id is the global address for a component instance created earlier, this invocation looks like:

app::some_component::some_member_action act;     // define an instance of some_member_action
hpx::post(act, id, "42");

In this case any value returned from this action (here, the integer 42) is ignored. Please look at Action type definition for the code defining the component action some_member_action used.

Posting an action asynchronously with synchronization#

This method will make sure the action is scheduled to run on the target locality. Posting the action itself does not wait for the function to start running or to complete, instead this is a fully asynchronous operation similar to using hpx::post as described above. The difference is that this method will return an instance of a hpx::future<> encapsulating the result of the (possibly remote) execution. The future can be used to synchronize with the asynchronous operation. The following example shows how to post the action from above on the local locality:

some_global_action act;     // define an instance of some_global_action
hpx::future<void> f = hpx::async(act, hpx::find_here(), 2.0);
//
// ... other code can be executed here
//
f.get();    // this will possibly wait for the asynchronous operation to 'return'

(as before, the function hpx::find_here() returns the id of the local locality, i.e. the locality this code is executed on).

Note

The use of a hpx::future<void> allows the current thread to synchronize with any remote operation not returning any value.

Note

Any std::future<> returned from std::async() is required to block in its destructor if the value has not been set for this future yet. This is not true for hpx::future<> which will never block in its destructor, even if the value has not been returned to the future yet. We believe that consistency in the behavior of futures is more important than standards conformance in this case.
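
The standard-library half of this note can be observed with plain C++. The following sketch uses only std:: facilities and does not demonstrate the HPX behavior; the function name std_async_future_blocks_in_destructor is made up for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Returns true if the task had finished by the time the std::future
// returned from std::async was destroyed.
inline bool std_async_future_blocks_in_destructor()
{
    std::atomic<bool> finished{false};
    {
        std::future<void> f = std::async(std::launch::async, [&finished] {
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            finished = true;
        });
        assert(f.valid());
        // 'f' goes out of scope here without get() or wait() being called;
        // its destructor waits for the task to complete.
    }
    return finished.load();
}
```

The function returns true because the destructor of the std::future waits for the task. According to the note above, an hpx::future<> in the same position would not wait.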

Any component member function can be invoked using the same syntactic construct. Given that id is the global address for a component instance created earlier, this invocation looks like:

app::some_component::some_member_action act;     // define an instance of some_member_action
hpx::future<int> f = hpx::async(act, id, "42");
//
// ... other code can be executed here
//
std::cout << f.get();    // this will possibly wait for the asynchronous operation to 'return' 42

Note

The invocation of f.get() will return the result immediately (without suspending the calling thread) if the result from the asynchronous operation has already been returned. Otherwise, the invocation of f.get() will suspend the execution of the calling thread until the asynchronous operation returns its result.

Posting an action synchronously#

This method will schedule the function wrapped in the specified action on the target locality. While the invocation appears to be synchronous (as we will see), the calling thread will be suspended while waiting for the function to return. Invoking a plain action (e.g. a global function) synchronously is straightforward:

some_global_action act;     // define an instance of some_global_action
act(hpx::find_here(), 2.0);

While this call looks just like a normal synchronous function invocation, the function wrapped by the action will be scheduled to run on a new thread and the calling thread will be suspended. After the new thread has executed the wrapped global function, the waiting thread will resume and return from the synchronous call.

Equivalently, any action wrapping a component member function can be invoked synchronously as follows:

app::some_component::some_member_action act;     // define an instance of some_member_action
int result = act(id, "42");

The action invocation will either schedule a new thread locally to execute the wrapped member function (as before, id is the global address of the component instance the member function should be invoked on), or it will send a parcel to the remote locality of the component causing a new thread to be scheduled there. The calling thread will be suspended until the function returns its result. This result will be returned from the synchronous action invocation.

It is very important to understand that this ‘synchronous’ invocation syntax in fact conceals an asynchronous function call. This is beneficial as the calling thread is suspended while waiting for the outcome of a potentially remote operation. The HPX thread scheduler will schedule other work in the meantime, allowing the application to make further progress while the remote result is computed. This helps overlapping computation with communication and hiding communication latencies.

Note

The syntax of posting an action is always the same, regardless of whether the target locality is remote to the invocation locality or not. This is a very important feature of HPX as it frees the user from the task of keeping track of which actions have to be applied locally and which actions are remote. If the target for posting an action is local, a new thread is automatically created and scheduled. Once this thread is scheduled and run, it will execute the function encapsulated by that action. If the target is remote, HPX will send a parcel to the remote locality which encapsulates the action and its parameters. Once the parcel is received on the remote locality, HPX will create and schedule a new thread there. Once this thread runs on the remote locality, it will execute the function encapsulated by the action.
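
The routing decision described in this note can be sketched as a toy model in standard C++. toy_locality and toy_parcel are hypothetical types; real parcels, serialization, and thread scheduling are not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy model, not HPX code.
struct toy_parcel
{
    std::uint32_t destination;    // target locality
    std::string action_name;      // which action to run there
    double argument;              // its (single) argument
};

struct toy_locality
{
    std::uint32_t here;                        // id of this locality
    std::vector<std::string> local_threads;    // work scheduled locally
    std::vector<toy_parcel> outgoing;          // parcels to be sent

    // One call for local and remote targets; the routing is internal.
    void post(std::uint32_t target, std::string const& action, double arg)
    {
        assert(!action.empty());
        if (target == here)
            local_threads.push_back(action);
        else
            outgoing.push_back(toy_parcel{target, action, arg});
    }
};
```

The caller uses the same post() call in both cases; only the target id decides whether a local thread is scheduled or a parcel is queued for sending.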

Posting an action with a continuation but without any synchronization#

This method is very similar to the method described in section Posting an action asynchronously without any synchronization. The difference is that it allows the user to chain a sequence of asynchronous operations, while handing the (intermediate) results from one step to the next step in the chain. Where hpx::post invokes a single function using ‘fire and forget’ semantics, hpx::post_continue asynchronously triggers a chain of functions without the need for the execution flow ‘to come back’ to the invocation site. Each of the asynchronous functions can be executed on a different locality.

Posting an action with a continuation and with synchronization#

This method is very similar to the method described in section Posting an action asynchronously with synchronization. In addition to what hpx::async can do, the function hpx::async_continue takes an additional function argument. This function will be called as the continuation of the executed action. It is expected to perform additional operations and to make sure that a result is returned to the original invocation site. This method chains operations asynchronously by providing a continuation operation which is automatically executed once the first action has finished executing.

As an example we chain two actions, where the result of the first action is forwarded to the second action and the result of the second action is sent back to the original invocation site:

// first action
std::int32_t action1(std::int32_t i)
{
    return i+1;
}
HPX_PLAIN_ACTION(action1, action1_type);    // defines action1_type

// second action
std::int32_t action2(std::int32_t i)
{
    return i*2;
}
HPX_PLAIN_ACTION(action2, action2_type);    // defines action2_type

// this code invokes 'action1' above and passes along a continuation
// function which will forward the result returned from 'action1' to
// 'action2'.
action1_type act1;     // define an instance of 'action1_type'
action2_type act2;     // define an instance of 'action2_type'
hpx::future<int> f =
    hpx::async_continue(act1, hpx::make_continuation(act2),
        hpx::find_here(), 42);
hpx::cout << f.get() << "\n";   // will print: 86 ((42 + 1) * 2)

By default, the continuation is executed on the same locality as hpx::async_continue is invoked from. If you want to specify the locality where the continuation should be executed, the code above has to be written as:

// this code invokes 'action1' above and passes along a continuation
// function which will forward the result returned from 'action1' to
// 'action2'.
action1_type act1;     // define an instance of 'action1_type'
action2_type act2;     // define an instance of 'action2_type'
hpx::future<int> f =
    hpx::async_continue(act1, hpx::make_continuation(act2, hpx::find_here()),
        hpx::find_here(), 42);
hpx::cout << f.get() << "\n";   // will print: 86 ((42 + 1) * 2)

Similarly, it is possible to chain more than 2 operations:

action1_type act1;     // define an instance of 'action1_type'
action2_type act2;     // define an instance of 'action2_type'
hpx::future<int> f =
    hpx::async_continue(act1,
        hpx::make_continuation(act2, hpx::make_continuation(act1)),
        hpx::find_here(), 42);
hpx::cout << f.get() << "\n";   // will print: 87 ((42 + 1) * 2 + 1)

The function hpx::make_continuation creates a special function object which exposes the following prototype:

struct continuation
{
    template <typename Result>
    void operator()(hpx::id_type id, Result&& result) const
    {
        ...
    }
};

where the parameters passed to the overloaded function operator operator()() are:

  • the id is the global id where the final result of the asynchronous chain of operations should be sent to (in most cases this is the id of the hpx::future returned from the initial call to hpx::async_continue). Any custom continuation function should make sure this id is forwarded to the last operation in the chain.

  • the result is the result value of the current operation in the asynchronous execution chain. This value needs to be forwarded to the next operation.

Note

All of those operations are implemented by the predefined continuation function object which is returned from hpx::make_continuation. Any (custom) function object used as a continuation should conform to the same interface.
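
The forwarding of intermediate results can be illustrated without HPX. The sketch below is a conceptual, single-process model in standard C++: toy_chain and step are hypothetical names, and the id that a real continuation forwards is left out. It reuses the arithmetic of action1 and action2 from the examples above.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Conceptual sketch only; this is not how HPX implements continuations.
using step = std::function<std::int32_t(std::int32_t)>;

// Returns a callable that runs 'first' and forwards its result to 'next'.
inline step toy_chain(step first, step next)
{
    assert(first && next);
    return [first = std::move(first), next = std::move(next)](
               std::int32_t i) { return next(first(i)); };
}

inline std::int32_t action1(std::int32_t i) { return i + 1; }
inline std::int32_t action2(std::int32_t i) { return i * 2; }
```

With these definitions, toy_chain(action1, action2)(42) evaluates to 86 and toy_chain(toy_chain(action1, action2), action1)(42) to 87, matching the results of the two- and three-step chains shown earlier.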

Action error handling#

Like in any other asynchronous invocation scheme it is important to be able to handle error conditions occurring while the asynchronous (and possibly remote) operation is executed. In HPX all error handling is based on standard C++ exception handling. Any exception thrown during the execution of an asynchronous operation will be transferred back to the original invocation locality, where it is rethrown during synchronization with the calling thread.
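
The capture-transport-rethrow pattern described here can be sketched with the standard std::exception_ptr facilities. This is an illustration in plain C++, not the HPX implementation; all toy_ names are hypothetical, and no actual transfer between localities takes place.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Stand-in for the (possibly remote) function.
inline void toy_function_with_error(int arg)
{
    if (arg < 0)
        throw std::invalid_argument("some really bad error happened");
}

// "Executing" side: run the function and capture any exception.
inline std::exception_ptr toy_run_and_capture(int arg)
{
    try
    {
        toy_function_with_error(arg);
    }
    catch (...)
    {
        std::exception_ptr error = std::current_exception();
        assert(error);
        return error;
    }
    return nullptr;
}

// "Calling" side: the synchronization point rethrows a captured
// exception. Returns the error message, or an empty string on success.
inline std::string toy_synchronize(std::exception_ptr error)
{
    try
    {
        if (error)
            std::rethrow_exception(error);
    }
    catch (std::exception const& e)
    {
        return e.what();
    }
    return "";
}
```

The exception is thrown in one place, carried as a value, and only surfaces where the caller synchronizes, which mirrors the behavior described for synchronous invocation and for future::get().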

Important

Exceptions thrown during asynchronous execution can be transferred back to the invoking thread only for the synchronous and the asynchronous case with synchronization. Like with any other unhandled exception, any exception thrown during the execution of an asynchronous action without synchronization will result in calling hpx::terminate causing the running application to exit immediately.

Note

Even if error handling internally relies on exceptions, most of the API functions exposed by HPX can be used without throwing an exception. Please see Working with exceptions for more information.

As an example, we will assume that the following remote function will be executed:

namespace app
{
    void some_function_with_error(int arg)
    {
        if (arg < 0) {
            HPX_THROW_EXCEPTION(hpx::error::bad_parameter,
                "some_function_with_error",
                "some really bad error happened");
        }
        // do something else...
    }
}

// This will define the action type 'some_error_action' which represents
// the function 'app::some_function_with_error'.
HPX_PLAIN_ACTION(app::some_function_with_error, some_error_action);

The use of HPX_THROW_EXCEPTION to report the error encapsulates the creation of an hpx::exception which is initialized with the error code hpx::error::bad_parameter. Additionally, it carries the passed strings as well as the file name, line number, and call stack of the point the exception was thrown from.

We invoke this action using the synchronous syntax as described before:

// note: wrapped function will throw hpx::exception
some_error_action act;            // define an instance of some_error_action
try {
    act(hpx::find_here(), -3);    // exception will be rethrown from here
}
catch (hpx::exception const& e) {
    // prints: 'some really bad error happened: HPX(bad parameter)'
    std::cout << e.what();
}

If this action is invoked asynchronously with synchronization, the exception is propagated to the waiting thread as well and is re-thrown from the future’s function get():

// note: wrapped function will throw hpx::exception
some_error_action act;            // define an instance of some_error_action
hpx::future<void> f = hpx::async(act, hpx::find_here(), -3);
try {
    f.get();                      // exception will be rethrown from here
}
catch (hpx::exception const& e) {
    // prints: 'some really bad error happened: HPX(bad parameter)'
    std::cout << e.what();
}

For more information about error handling please refer to the section Working with exceptions. There we also explain how to handle error conditions without having to rely on exceptions.

Writing components#

A component in HPX is a C++ class which can be created remotely and whose member functions can be invoked remotely as well. The following sections highlight how components can be defined, created, and used.

Defining components#

In order for a C++ class type to be managed remotely in HPX, the type must be derived from the hpx::components::component_base template type. We call such C++ class types ‘components’.

Note that the component type itself is passed as a template argument to the base class:

// header file some_component.hpp

#include <hpx/include/components.hpp>

namespace app
{
    // Define a new component type 'some_component'
    struct some_component
      : hpx::components::component_base<some_component>
    {
        // This member function can be invoked remotely
        int some_member_function(std::string const& s)
        {
            return boost::lexical_cast<int>(s);
        }

        // This will define the action type 'some_member_action' which
        // represents the member function 'some_member_function' of the
        // object type 'some_component'.
        HPX_DEFINE_COMPONENT_ACTION(some_component, some_member_function, some_member_action);
    };
}

// This will generate the necessary boiler-plate code for the action allowing
// it to be invoked remotely. This declaration macro has to be placed in the
// header file defining the component itself.
//
// Note: The second argument to the macro below has to be a system-wide
//       unique C++ identifier
//
HPX_REGISTER_ACTION_DECLARATION(app::some_component::some_member_action, some_component_some_action);
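
Passing the derived type to its own base class, as in component_base<some_component>, is the curiously recurring template pattern (CRTP). The sketch below shows the pattern in isolation with hypothetical toy_ types. It does not reproduce what hpx::components::component_base does with the derived type.

```cpp
#include <cassert>
#include <string>

// Toy base class, not hpx::components::component_base.
template <typename Derived>
struct toy_component_base
{
    // The base can call into the derived type without virtual functions
    // because it knows the derived type at compile time.
    std::string describe() const
    {
        char const* name = static_cast<Derived const&>(*this).type_name();
        assert(name != nullptr);
        return std::string("component of type ") + name;
    }
};

struct toy_some_component : toy_component_base<toy_some_component>
{
    char const* type_name() const
    {
        return "toy_some_component";
    }
};
```

Calling describe() on a toy_some_component yields "component of type toy_some_component", with the call into the derived class resolved at compile time.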

There is more boiler plate code which has to be placed into a source file in order for the component to be usable. Every component type is required to have macros placed into its source file, one for each component type and one macro for each of the actions defined by the component type.

For instance:

// source file some_component.cpp

#include "some_component.hpp"

// The following code generates all necessary boiler plate to enable the
// remote creation of 'app::some_component' instances with 'hpx::new_<>()'
//
using some_component = app::some_component;
using some_component_type = hpx::components::component<some_component>;

// Please note that the second argument to this macro must be a
// (system-wide) unique C++-style identifier (without any namespaces)
//
HPX_REGISTER_COMPONENT(some_component_type, some_component);

// The parameters for this macro have to be the same as used in the corresponding
// HPX_REGISTER_ACTION_DECLARATION() macro invocation in the corresponding
// header file.
//
// Please note that the second argument to this macro must be a
// (system-wide) unique C++-style identifier (without any namespaces)
//
HPX_REGISTER_ACTION(app::some_component::some_member_action, some_component_some_action);

Defining client side representation classes#

Often it is very convenient to define a separate type for a component which can be used on the client side (from where the component is instantiated and used). This step might seem like unnecessary code duplication; however, it significantly increases the type safety of the code.

A possible implementation of such a client side representation for the component described in the previous section could look like:

#include <hpx/include/components.hpp>

namespace app
{
    // Define a client side representation type for the component type
    // 'some_component' defined in the previous section.
    //
    struct some_component_client
      : hpx::components::client_base<some_component_client, some_component>
    {
        using base_type = hpx::components::client_base<
                some_component_client, some_component>;

        some_component_client(hpx::future<hpx::id_type> && id)
          : base_type(std::move(id))
        {}

        hpx::future<int> some_member_function(std::string const& s)
        {
            some_component::some_member_action act;
            return hpx::async(act, get_id(), s);
        }
    };
}

A client side object stores the global id of the component instance it represents. This global id is accessible by calling the function client_base<>::get_id(). The special constructor provided in the example allows this client side object to be created directly using the API function hpx::new_.
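
The type-safety benefit of a client side type can be shown with a small model in standard C++. All names below (toy_id, toy_client, toy_printer, toy_counter, toy_use_counter) are hypothetical. The sketch only illustrates how wrapping an untyped id in a typed handle lets the compiler reject mix-ups.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Toy model; none of these types are HPX types.
struct toy_id
{
    std::uint64_t value;    // untyped: says nothing about the object
};

struct toy_printer {};
struct toy_counter {};

// Typed handle wrapping an untyped id.
template <typename Component>
class toy_client
{
public:
    explicit toy_client(toy_id id) : id_(id)
    {
        assert(id.value != 0);    // 0 is treated as 'invalid' in this toy
    }

    toy_id get_id() const { return id_; }

private:
    toy_id id_;
};

// Accepts only clients of toy_counter.
inline std::uint64_t toy_use_counter(toy_client<toy_counter> const& c)
{
    return c.get_id().value;
}

// A client of a different component type is not accepted in its place.
static_assert(!std::is_convertible<toy_client<toy_printer>,
                  toy_client<toy_counter>>::value,
    "clients of different component types must not be interchangeable");
```

A function taking a bare toy_id could be handed the id of any object, while toy_use_counter can only be called with a toy_client<toy_counter>.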

Creating component instances#

Instances of defined component types can be created in two different ways. If the component to create has a defined client side representation type, then this can be used, otherwise use the server type.

The following examples assume that some_component_type is the type of the server side implementation of the component to create. All additional arguments (see , ... notation below) are passed through to the corresponding constructor calls of those objects:

// create one instance on the given locality
hpx::id_type here = hpx::find_here();
hpx::future<hpx::id_type> f =
    hpx::new_<some_component_type>(here, ...);

// create one instance using the given distribution
// policy (here: hpx::colocating_distribution_policy)
hpx::id_type here = hpx::find_here();
hpx::future<hpx::id_type> f =
    hpx::new_<some_component_type>(hpx::colocated(here), ...);

// create multiple instances on the given locality
hpx::id_type here = hpx::find_here();
hpx::future<std::vector<hpx::id_type>> f =
    hpx::new_<some_component_type[]>(here, num, ...);

// create multiple instances using the given distribution
// policy (here: hpx::binpacking_distribution_policy)
hpx::future<std::vector<hpx::id_type>> f = hpx::new_<some_component_type[]>(
    hpx::binpacking(hpx::find_all_localities()), num, ...);

The examples below demonstrate the use of the same API functions for creating client side representation objects (instead of just plain ids). These examples assume that client_type is the type of the client side representation of the component type to create. As above, all additional arguments (see , ... notation below) are passed through to the corresponding constructor calls of the server side implementation objects corresponding to the client_type:

// create one instance on the given locality
hpx::id_type here = hpx::find_here();
client_type c = hpx::new_<client_type>(here, ...);

// create one instance using the given distribution
// policy (here: hpx::colocating_distribution_policy)
hpx::id_type here = hpx::find_here();
client_type c = hpx::new_<client_type>(hpx::colocated(here), ...);

// create multiple instances on the given locality
hpx::id_type here = hpx::find_here();
hpx::future<std::vector<client_type>> f =
    hpx::new_<client_type[]>(here, num, ...);

// create multiple instances using the given distribution
// policy (here: hpx::binpacking_distribution_policy)
hpx::future<std::vector<client_type>> f = hpx::new_<client_type[]>(
    hpx::binpacking(hpx::find_all_localities()), num, ...);

Using component instances#

After having created the component instances as described above, we can simply use them as indicated below:

#include <hpx/include/components.hpp>
#include <iostream>
#include <vector>

// Define a simple component
struct some_component : hpx::components::component_base<some_component>
{
    void print() const
    {
        std::cout << "Hello from component instance!" << std::endl;
    }
    HPX_DEFINE_COMPONENT_ACTION(some_component, print, print_action);
};

typedef some_component::print_action print_action;

// Create one instance on the given locality
hpx::id_type here = hpx::find_here();
hpx::future<hpx::id_type> f1 =
    hpx::new_<some_component>(here);

// Get the future value
hpx::id_type instance_id = f1.get();

// Invoke action on the instance
hpx::async<print_action>(instance_id).get();

// Create multiple instances on the given locality
int num = 3;
hpx::future<std::vector<hpx::id_type>> f2 =
    hpx::new_<some_component[]>(here, num);

// Get the future value
std::vector<hpx::id_type> instance_ids = f2.get();

// Invoke action on each instance
for (const auto& id : instance_ids)
{
    hpx::async<print_action>(id).get();
}

We can use the component instances with distribution policies the same way.

Segmented containers#

In parallel programming, there is now a plethora of solutions aimed at implementing “partially contiguous” or segmented data structures, whether on shared memory systems or distributed memory systems. HPX implements such structures by drawing inspiration from Standard C++ containers.

Using segmented containers#

A segmented container is a template class defined in the namespace hpx. Each segmented container is semantically very similar to its sequential counterpart in namespace std, but takes an additional template parameter named DistPolicy. The distribution policy is an optional parameter that is passed last to the segmented container constructor (after the container size when no default value is given, or after the default value otherwise). The distribution policy describes how a container is segmented and where each segment is placed among the available runtime localities.

However, only a subset of the std container member functions has been reimplemented:

  • (constructor), (destructor), operator=

  • operator[]

  • begin, cbegin, end, cend

  • size

An example of how to use the partitioned_vector container would be:

#include <hpx/include/partitioned_vector.hpp>

// The following code generates all necessary boiler plate to enable the
// remote creation of 'partitioned_vector' segments
//
HPX_REGISTER_PARTITIONED_VECTOR(double);

// By default, the number of segments is equal to the current number of
// localities
//
hpx::partitioned_vector<double> va(50);
hpx::partitioned_vector<double> vb(50, 0.0);

An example of how to use the partitioned_vector container with distribution policies would be:

#include <hpx/include/partitioned_vector.hpp>
#include <hpx/runtime_distributed/find_localities.hpp>

// The following code generates all necessary boiler plate to enable the
// remote creation of 'partitioned_vector' segments
//
HPX_REGISTER_PARTITIONED_VECTOR(double);

std::size_t num_segments = 10;
std::vector<hpx::id_type> locs = hpx::find_all_localities();

auto layout =
        hpx::container_layout( num_segments, locs );

// The number of segments is 10 and those segments are spread across the
// localities collected in the variable locs in a Round-Robin manner
//
hpx::partitioned_vector<double> va(50, layout);
hpx::partitioned_vector<double> vb(50, 0.0, layout);
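
The placement described in the comments above can be pictured with a small standalone sketch. It does not use HPX: the helper names are hypothetical, and the assumption that elements are split into equally sized consecutive blocks (with a possibly shorter last block) is made here for illustration only. Only the round-robin assignment of segments to localities is taken from the text.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of HPX: number of elements per segment when
// 'size' elements are split into 'num_segments' consecutive blocks of equal
// length (the last block may be shorter). Block partitioning is an assumption.
std::size_t segment_length(std::size_t size, std::size_t num_segments)
{
    assert(num_segments > 0);
    return (size + num_segments - 1) / num_segments;
}

// Hypothetical helper: index of the segment that holds element 'index'.
std::size_t segment_of(
    std::size_t index, std::size_t size, std::size_t num_segments)
{
    assert(index < size);
    return index / segment_length(size, num_segments);
}

// Hypothetical helper: round-robin placement of a segment on a locality.
std::size_t locality_of(std::size_t segment, std::size_t num_localities)
{
    assert(num_localities > 0);
    return segment % num_localities;
}
```

Under these assumptions, a container of 50 elements in 10 segments has 5 elements per segment, and with 4 localities segment 9 would be placed on locality 1.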

By definition, a segmented container must be accessible from any thread, although its construction is synchronous only for the thread that called its constructor. To overcome this problem, it is possible to assign a symbolic name to the segmented container:

#include <hpx/include/partitioned_vector.hpp>

// The following code generates all necessary boiler plate to enable the
// remote creation of 'partitioned_vector' segments
//
HPX_REGISTER_PARTITIONED_VECTOR(double);

hpx::future<void> fserver = hpx::async(
  [](){
    hpx::partitioned_vector<double> v(50);

    // Register the 'partitioned_vector' with the name "some_name"
    //
    v.register_as("some_name");

    /* Do some code  */
  });

hpx::future<void> fclient =
  hpx::async(
    [](){
      // Naked 'partitioned_vector'
      //
      hpx::partitioned_vector<double> v;

      // Now the variable v points to the same 'partitioned_vector' that has
      // been registered with the name "some_name"
      //
      v.connect_to("some_name");

      /* Do some code  */
    });
Segmented containers#

HPX provides the following segmented containers:

Table 22 Sequence containers#

| Name | Description | In header | C++ standard |
|---|---|---|---|
| hpx::partitioned_vector | Dynamic segmented contiguous array. | <hpx/include/partitioned_vector.hpp> | vector |

Table 23 Unordered associative containers#

| Name | Description | In header | C++ standard |
|---|---|---|---|
| hpx::unordered_map | Segmented collection of key-value pairs, hashed by keys, keys are unique. | <hpx/include/unordered_map.hpp> | unordered_map |

Segmented iterators and segmented iterator traits#

The basic iterators used in the C++ Standard Library are only suitable for one-dimensional structures. The iterators used in HPX must adapt to the segmented format of our containers. When incremented, an HPX iterator can determine whether the next element of type T is in the same data segment or in another segment. In the latter case, the iterator automatically points to the beginning of the next segment.

Note

Note that the dereference operator (operator*) does not directly return a reference of type T& but an intermediate object wrapping this reference. When this object is used as an l-value, a remote write operation is performed; when it is used as an r-value, the implicit conversion to type T performs a remote read operation.
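
The proxy mechanism described in this note can be illustrated without HPX. In the following minimal sketch, every name is hypothetical: fake_remote_store stands in for data held by another locality, and element_proxy plays the role of the intermediate object. It is not the HPX implementation; it only shows how assignment and implicit conversion can be mapped to write and read operations.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for data living on another locality; it counts accesses so the
// effect of each expression is visible. Hypothetical, not an HPX type.
struct fake_remote_store
{
    std::vector<double> data;
    std::size_t reads = 0;
    std::size_t writes = 0;
};

// Proxy returned by a dereference instead of a plain 'double&'.
class element_proxy
{
public:
    element_proxy(fake_remote_store& store, std::size_t index)
      : store_(store), index_(index)
    {
    }

    // Used as an l-value: performs the (simulated) remote write.
    element_proxy& operator=(double value)
    {
        store_.data[index_] = value;
        ++store_.writes;
        return *this;
    }

    // Used as an r-value: the implicit conversion performs the
    // (simulated) remote read.
    operator double() const
    {
        ++store_.reads;
        return store_.data[index_];
    }

private:
    fake_remote_store& store_;
    std::size_t index_;
};
```

With a proxy p referring to some element, the statement p = 1.5 increments the write counter, while double x = p increments the read counter.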

It is sometimes useful not only to iterate element by element, but also segment by segment, or simply to get a local iterator in order to avoid additional construction costs at each dereferencing operation. To address this need, hpx::traits::segmented_iterator_traits is used.

With segmented_iterator_traits, users can uniformly obtain iterators that iterate over segments (by providing a segmented iterator as a parameter), or obtain the local begin/end iterators of the nearest local segment (by providing a per-segment iterator as a parameter):

#include <hpx/include/partitioned_vector.hpp>

// The following code generates all necessary boiler plate to enable the
// remote creation of 'partitioned_vector' segments
//
HPX_REGISTER_PARTITIONED_VECTOR(double);

using iterator = hpx::partitioned_vector<double>::iterator;
using traits   = hpx::traits::segmented_iterator_traits<iterator>;

hpx::partitioned_vector<double> v;
std::size_t count = 0;

auto seg_begin = traits::segment(v.begin());
auto seg_end   = traits::segment(v.end());

// Iterate over segments
for (auto seg_it = seg_begin; seg_it != seg_end; ++seg_it)
{
    auto loc_begin = traits::begin(seg_it);
    auto loc_end   = traits::end(seg_it);

    // Iterate over elements inside segments
    for (auto lit = loc_begin; lit != loc_end; ++lit, ++count)
    {
        *lit = count;
    }
}

Which is equivalent to:

hpx::partitioned_vector<double> v;
std::size_t count = 0;

auto begin = v.begin();
auto end   = v.end();

for (auto it = begin; it != end; ++it, ++count)
{
    *it = count;
}
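
The equivalence between the two traversals can also be checked on ordinary standard containers. The sketch below is a simplified, HPX-free model: the segments live in a std::vector of std::vector, and flat_cursor is a hypothetical element-wise cursor that hops to the next segment when it reaches the end of the current one. The real HPX iterators are considerably more involved.

```cpp
#include <cstddef>
#include <vector>

using segments = std::vector<std::vector<double>>;

// Segment-by-segment traversal: outer loop over segments, inner loop over
// the elements of one segment.
void fill_by_segment(segments& segs)
{
    std::size_t count = 0;
    for (auto& seg : segs)
        for (auto& elem : seg)
            elem = static_cast<double>(count++);
}

// Minimal element-wise cursor over the same storage (hypothetical, far
// simpler than the HPX iterator).
class flat_cursor
{
public:
    explicit flat_cursor(segments& segs) : segs_(&segs)
    {
        skip_empty();
    }

    bool at_end() const { return seg_ == segs_->size(); }
    double& operator*() const { return (*segs_)[seg_][off_]; }

    // Advance by one element; move to the next segment when the current
    // one is exhausted.
    flat_cursor& operator++()
    {
        ++off_;
        skip_empty();
        return *this;
    }

private:
    // Step over exhausted or empty segments so the cursor always rests
    // on an element or on the end position.
    void skip_empty()
    {
        while (seg_ < segs_->size() && off_ == (*segs_)[seg_].size())
        {
            ++seg_;
            off_ = 0;
        }
    }

    segments* segs_;
    std::size_t seg_ = 0;
    std::size_t off_ = 0;
};

// Element-by-element traversal using the cursor.
void fill_flat(segments& segs)
{
    std::size_t count = 0;
    for (flat_cursor it(segs); !it.at_end(); ++it)
        *it = static_cast<double>(count++);
}
```

Both functions write the same values into the same positions; the segment-wise version simply avoids the per-increment boundary check of the element-wise cursor.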
Using views#

The use of multidimensional arrays is quite common in numerical computing, whether to perform dense matrix operations or to process images. Many libraries implement such classes and overload their basic operators (e.g. +, -, *, (), etc.). However, such operations become more delicate when the underlying data layout is segmented or when it is mandatory to use optimized linear algebra subroutines (e.g. BLAS subroutines).

Our solution is thus to relax the level of abstraction by allowing the user to work not directly on n-dimensional data, but on “n-dimensional collections of 1-D arrays”. The use of well-accepted techniques on contiguous data is thus preserved at the segment level, and the composability of the segments is made possible by a multidimensional array-inspired access mode.

Preface: Why SPMD?#

Although HPX rejects this programming model by design, locality plays a dominant role when implementing vectorized code. To maximize local computation and avoid unneeded data transfers, a parallel section (or Single Program Multiple Data section) is required. Because the use of global variables is prohibited, this parallel section is created via the RAII idiom.

To define a parallel section, simply write an action taking a spmd_block variable as a first parameter:

#include <hpx/collectives/spmd_block.hpp>

void bulk_function(hpx::lcos::spmd_block block /* , arg0, arg1, ... */)
{
    // Parallel section

    /* Do some code */
}
HPX_PLAIN_ACTION(bulk_function, bulk_action);

Note

In the following paragraphs, we will use the term “image” several times. An image is defined as a lightweight process whose entry point is a function provided by the user. It’s an “image of the function”.

The spmd_block class contains the following methods:

  • Team information: get_num_images, this_image, images_per_locality

  • Control statements: sync_all, sync_images

Here is a sample code summarizing the features offered by the spmd_block class:

#include <hpx/collectives/spmd_block.hpp>

#include <cstddef>
#include <vector>

void bulk_function(hpx::lcos::spmd_block block /* , arg0, arg1, ... */)
{
    std::size_t num_images = block.get_num_images();
    std::size_t this_image = block.this_image();
    std::size_t images_per_locality = block.images_per_locality();

    /* Do some code */

    // Synchronize all images in the team
    block.sync_all();

    /* Do some code */

    // Synchronize image 0 and image 1
    block.sync_images(0,1);

    /* Do some code */

    std::vector<std::size_t> vec_images = {2,3,4};

    // Synchronize images 2, 3 and 4
    block.sync_images(vec_images);

    // Alternative call to synchronize images 2, 3 and 4
    block.sync_images(vec_images.begin(), vec_images.end());

    /* Do some code */

    // Non-blocking version of sync_all()
    hpx::future<void> event =
        block.sync_all(hpx::launch::async);

    // Callback waiting for 'event' to be ready before being scheduled
    hpx::future<void> cb =
        event.then(
          [](hpx::future<void>)
          {

            /* Do some code */

          });

    // Finally wait for the execution tree to be finished
    cb.get();
}
HPX_PLAIN_ACTION(bulk_function, bulk_action);

Then, in order to invoke the parallel section, call the function define_spmd_block specifying an arbitrary symbolic name and indicating the number of images per locality to create:

void bulk_function(hpx::lcos::spmd_block block /* , arg0, arg1, ... */)
{
    // Parallel section
}
HPX_PLAIN_ACTION(bulk_function, bulk_action);

int main()
{
    /* std::size_t arg0, arg1, ...; */

    bulk_action act;
    std::size_t images_per_locality = 4;

    // Instantiate the parallel section
    hpx::lcos::define_spmd_block(
        "some_name", images_per_locality, std::move(act) /*, arg0, arg1, ... */);

    return 0;
}

Note

In principle, the user should never call the spmd_block constructor. The define_spmd_block function is responsible for instantiating spmd_block objects and broadcasting them to each created image.

SPMD multidimensional views#

Some classes are defined as “container views” when their purpose is to observe and/or modify the values of a container from a perspective other than the one that characterizes the container. For example, the values of a std::vector object v can be accessed via the expression v[i]. Container views can be used, for example, when those values should be “viewed” as a 2D matrix that has been flattened into a std::vector. The values would then be accessible via the expression vv(i,j), which would internally call the expression v[k].
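
As a concrete illustration of the idea, here is a minimal sketch of a 2-D view over a flat std::vector. The class view_2d is hypothetical and unrelated to the HPX view classes, and the row-major mapping k = i * width + j is an assumption chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D view over a flat std::vector; it owns no data.
class view_2d
{
public:
    view_2d(std::vector<double>& v, std::size_t height, std::size_t width)
      : v_(v), height_(height), width_(width)
    {
        assert(v.size() == height * width);
    }

    // vv(i, j) forwards to v[k] with k = i * width + j (row-major,
    // an assumption of this sketch).
    double& operator()(std::size_t i, std::size_t j)
    {
        assert(i < height_ && j < width_);
        return v_[i * width_ + j];
    }

private:
    std::vector<double>& v_;
    std::size_t height_;
    std::size_t width_;
};
```

For a vector of six elements viewed as a 2 x 3 matrix, writing through vv(1, 2) modifies v[5]; the view only changes how the indices are expressed, not where the data is stored.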

By default, the partitioned_vector class integrates 1-D views of its segments:

#include <hpx/include/partitioned_vector.hpp>

// The following code generates all necessary boiler plate to enable the
// remote creation of 'partitioned_vector' segments
//
HPX_REGISTER_PARTITIONED_VECTOR(double);

using iterator = hpx::partitioned_vector<double>::iterator;
using traits   = hpx::traits::segmented_iterator_traits<iterator>;

hpx::partitioned_vector<double> v;

// Create a 1-D view of the vector of segments
auto vv = traits::segment(v.begin());

// Access segment i
std::vector<double> segment = vv[i];

Our views are called “multidimensional” in the sense that they generalize to N dimensions the purpose of segmented_iterator_traits::segment() in the 1-D case. Note that in a parallel section, the 2-D expression a(i,j) = b(i,j) is ambiguous because, without a convention, every invoked image would race to execute the statement. For this reason, our views are not only multidimensional but also “spmd-aware”.

Note

SPMD-awareness: The convention is simple. If an assignment statement contains a view subscript as an l-value, only the image holding the r-value evaluates the statement. (In MPI terms, this is a Put operation.)
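
The convention can be modelled with a toy example that involves no HPX code. In this sketch all names are hypothetical, and the rule that segment s is owned by image s % num_images is an assumed placement used only to make the example concrete. Every image reaches the same assignment statement, but only the owner of the r-value segment carries it out.

```cpp
#include <cstddef>
#include <vector>

// Toy model, not HPX code: segment 's' is owned by image 's % num_images'
// (an assumed placement).
struct toy_view
{
    std::vector<std::vector<double>> segments;
    std::size_t num_images;

    std::size_t owner(std::size_t s) const { return s % num_images; }
};

// Every image calls this for the statement 'vv(dst) = vv(src)'. Only the
// image owning the r-value segment performs the copy (a Put); the others
// return without touching any data. Returns true if this image did the Put.
bool spmd_assign(
    toy_view& vv, std::size_t this_image, std::size_t dst, std::size_t src)
{
    if (vv.owner(src) != this_image)
        return false;
    vv.segments[dst] = vv.segments[src];
    return true;
}
```

If three images all call spmd_assign(vv, image, 0, 3), exactly one of them (image 0 under the assumed placement) performs the copy, so no race occurs.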

Subscript-based operations#

Here are some examples of using subscripts in the 2-D view case:

#include <hpx/components/containers/partitioned_vector/partitioned_vector_view.hpp>
#include <hpx/include/partitioned_vector.hpp>

// The following code generates all necessary boiler plate to enable the
// remote creation of 'partitioned_vector' segments
//
HPX_REGISTER_PARTITIONED_VECTOR(double);

using Vec = hpx::partitioned_vector<double>;
using View_2D = hpx::partitioned_vector_view<double,2>;

/* Do some code */

Vec v;

// Parallel section (suppose 'block' an spmd_block instance)
{
    std::size_t height, width;

    // Instantiate the view
    View_2D vv(block, v.begin(), v.end(), {height,width});

    // The l-value is a view subscript, the image that owns vv(1,0)
    // evaluates the assignment.
    vv(0,1) = vv(1,0);

    // The l-value is a view subscript, the image that owns the r-value
    // (result of expression 'std::vector<double>(4,1.0)') evaluates the
    // assignment : oops! race between all participating images.
    vv(2,3) = std::vector<double>(4,1.0);
}
Iterator-based operations#

Here are some examples of using iterators in the 3-D view case:

#include <hpx/components/containers/partitioned_vector/partitioned_vector_view.hpp>
#include <hpx/include/partitioned_vector.hpp>

// The following code generates all necessary boiler plate to enable the
// remote creation of 'partitioned_vector' segments
//
HPX_REGISTER_PARTITIONED_VECTOR(int);

using Vec = hpx::partitioned_vector<int>;
using View_3D = hpx::partitioned_vector_view<int,3>;

/* Do some code */

Vec v1, v2;

// Parallel section (suppose 'block' an spmd_block instance)
{
    std::size_t size_x, size_y, size_z;

    // Instantiate the views
    View_3D vv1(block, v1.begin(), v1.end(), {size_x,size_y,size_z});
    View_3D vv2(block, v2.begin(), v2.end(), {size_x,size_y,size_z});

    // Save previous segments covered by vv1 into segments covered by vv2
    auto vv2_it = vv2.begin();
    auto vv1_it = vv1.cbegin();

    for(; vv2_it != vv2.end(); vv2_it++, vv1_it++)
    {
        // It's a Put operation
        *vv2_it = *vv1_it;
    }

    // Ensure that all images have performed their Put operations
    block.sync_all();

    // Ensure that only one image is putting updated data into the different
    // segments covered by vv1
    if(block.this_image() == 0)
    {
        int idx = 0;

        // Update all the segments covered by vv1
        for(auto i = vv1.begin(); i != vv1.end(); i++)
        {
            // It's a Put operation; 'elt_size' (the segment size) is
            // assumed to be defined elsewhere
            *i = std::vector<int>(elt_size,idx++);
        }
    }
}

Here is an example that shows how to iterate only over segments owned by the current image:

#include <hpx/components/containers/partitioned_vector/partitioned_vector_view.hpp>
#include <hpx/components/containers/partitioned_vector/partitioned_vector_local_view.hpp>
#include <hpx/include/partitioned_vector.hpp>

// The following code generates all necessary boiler plate to enable the
// remote creation of 'partitioned_vector' segments
//
HPX_REGISTER_PARTITIONED_VECTOR(float);

using Vec = hpx::partitioned_vector<float>;
using View_1D = hpx::partitioned_vector_view<float,1>;

/* Do some code */

Vec v;

// Parallel section (suppose 'block' an spmd_block instance)
{
    std::size_t num_segments;

    // Instantiate the view
    View_1D vv(block, v.begin(), v.end(), {num_segments});

    // Instantiate the local view from the view
    auto local_vv = hpx::local_view(vv);

    for ( auto i = local_vv.begin(); i != local_vv.end(); i++ )
    {
        std::vector<float> & segment = *i;

        /* Do some code */
    }

}
Instantiating sub-views#

It is possible to construct views from other views; these are called sub-views. The only constraint is that a sub-view must retain the dimension and the value type of the input view. Here is an example showing how to create a sub-view:

#include <hpx/components/containers/partitioned_vector/partitioned_vector_view.hpp>
#include <hpx/include/partitioned_vector.hpp>

// The following code generates all necessary boiler plate to enable the
// remote creation of 'partitioned_vector' segments
//
HPX_REGISTER_PARTITIONED_VECTOR(float);

using Vec = hpx::partitioned_vector<float>;
using View_2D = hpx::partitioned_vector_view<float,2>;

/* Do some code */

Vec v;

// Parallel section (suppose 'block' an spmd_block instance)
{
    std::size_t N = 20;
    std::size_t tilesize = 5;

    // Instantiate the view
    View_2D vv(block, v.begin(), v.end(), {N,N});

    // Instantiate the subview
    View_2D svv(
        block,&vv(tilesize,0),&vv(2*tilesize-1,tilesize-1),{tilesize,tilesize},{N,N});

    if(block.this_image() == 0)
    {
        // Equivalent to 'vv(tilesize,0) = 2.0f'
        svv(0,0) = 2.0f;

        // Equivalent to 'vv(2*tilesize-1,tilesize-1) = 3.0f'
        svv(tilesize-1,tilesize-1) = 3.0f;
    }

}

Note

The last parameter of the subview constructor is the size of the original view. To create a subview of a subview, and so on, this parameter should stay unchanged ({N,N} in the example above).
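
One way to see why the original size must be carried along is to look at the index arithmetic a sub-view has to perform. The following sketch is hypothetical and not the HPX implementation; in particular, the layout in which the segment index of element (i, j) in an N0 x N1 view is i + j * N0 is an assumption. What matters is that the stride always comes from the original view, whatever the nesting depth.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical index arithmetic, not the HPX implementation. The segment
// index of element (i, j) in an N0 x N1 view is assumed to be i + j * N0.
struct subview_2d
{
    std::array<std::size_t, 2> origin;         // position of (0,0) in the original view
    std::array<std::size_t, 2> extent;         // size of this (sub-)view
    std::array<std::size_t, 2> full_extent;    // size of the original view

    std::size_t segment_index(std::size_t i, std::size_t j) const
    {
        assert(i < extent[0] && j < extent[1]);
        return (origin[0] + i) + (origin[1] + j) * full_extent[0];
    }

    // A nested sub-view shifts the origin and shrinks the extent, but keeps
    // 'full_extent' unchanged, since strides still refer to the original view.
    subview_2d sub(
        std::array<std::size_t, 2> o, std::array<std::size_t, 2> e) const
    {
        return subview_2d{{origin[0] + o[0], origin[1] + o[1]}, e, full_extent};
    }
};
```

With N = 20 and tilesize = 5, a sub-view with origin (5, 0) maps its element (0, 0) to the same segment as element (5, 0) of the original view, and its element (4, 4) to the same segment as element (9, 4), which matches the equivalences stated in the example above.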

C++ co-arrays#

Fortran has extended its scalar element indexing approach to reference each segment of a distributed array. In this extension, a segment is attributed a “co-index” and lives in a specific locality. A co-index provides the application with enough information to retrieve the corresponding data reference. In C++, containers present themselves as a “smarter” alternative to Fortran arrays, but there are still no corresponding standardized features similar to the Fortran co-indexing approach. We present here an implementation of such features in HPX.

Preface: co-array, a segmented container tied to an SPMD multidimensional view#

As mentioned before, a co-array is a distributed array whose segments are accessible through an array-inspired access mode. We have previously seen that it is possible to reproduce such access mode using the concept of views. Nevertheless, the user must pre-create a segmented container to instantiate this view. We illustrate below how a single constructor call can perform those two operations:

#include <hpx/components/containers/coarray/coarray.hpp>
#include <hpx/collectives/spmd_block.hpp>

// The following code generates all necessary boiler plate to enable the
// co-creation of 'coarray'
//
HPX_REGISTER_COARRAY(double);

// Parallel section (suppose 'block' an spmd_block instance)
{
    using hpx::container::placeholders::_;

    std::size_t height=32, width=4, segment_size=10;

    hpx::coarray<double,3> a(block, "a", {height,width,_}, segment_size);

    /* Do some code */
}

Unlike segmented containers, a co-array object can only be instantiated within a parallel section. Here is the description of the parameters to provide to the coarray constructor:

Table 24 Parameters of coarray constructor#

| Parameter | Description |
|---|---|
| block | Reference to a spmd_block object |
| "a" | Symbolic name of type std::string |
| {height,width,_} | Dimensions of the coarray object |
| segment_size | Size of a co-indexed element (i.e. size of the object referenced by the expression a(i,j,k)) |

Note that the “last dimension size” cannot be set by the user. It only accepts the constexpr variable hpx::container::placeholders::_. This size, which is considered private, is equal to the number of current images (value returned by block.get_num_images()).

Note

An important constraint to remember about coarray objects is that all segments sharing the same “last dimension index” are located in the same image.

Using co-arrays#

The member functions owned by the coarray objects are exactly the same as those of spmd multidimensional views. These are:

  • Subscript-based operations

  • Iterator-based operations

However, one additional functionality is provided. Knowing that the element a(i,j,k) is in the memory of the kth image, the use of local subscripts is possible.

Note

For spmd multidimensional views, subscripts are only global, as they may still involve remote data transfers.

Here is an example of using local subscripts:

#include <hpx/components/containers/coarray/coarray.hpp>
#include <hpx/collectives/spmd_block.hpp>

// The following code generates all necessary boiler plate to enable the
// co-creation of 'coarray'
//
HPX_REGISTER_COARRAY(double);

// Parallel section (suppose 'block' an spmd_block instance)
{
    using hpx::container::placeholders::_;

    std::size_t height=32, width=4, segment_size=10;

    hpx::coarray<double,3> a(block, "a", {height,width,_}, segment_size);

    double idx = block.this_image()*height*width;

    for (std::size_t j = 0; j<width; j++)
    for (std::size_t i = 0; i<height; i++)
    {
        // Local write operation performed via the use of local subscript
        a(i,j,_) = std::vector<double>(segment_size,idx);
        idx++;
    }

    block.sync_all();
}

Note

When the “last dimension index” of a subscript is equal to hpx::container::placeholders::_, local subscript (and not global subscript) is used. It is equivalent to a global subscript used with a “last dimension index” equal to the value returned by block.this_image().
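
The rule in this note amounts to a small overload set. The sketch below is a hypothetical model with no dependency on HPX: local_tag stands in for the placeholder type, and resolve_image returns the image that a given “last dimension index” refers to.

```cpp
#include <cstddef>

// Hypothetical stand-in for the placeholder '_' used as the last subscript.
struct local_tag
{
};
constexpr local_tag local{};

// A global subscript names the image explicitly.
constexpr std::size_t resolve_image(std::size_t k, std::size_t /* this_image */)
{
    return k;
}

// A local subscript resolves to the calling image.
constexpr std::size_t resolve_image(local_tag, std::size_t this_image)
{
    return this_image;
}
```

On image 3, resolve_image(local, 3) and resolve_image(3, 3) designate the same image, which mirrors the equivalence between a(i,j,_) and a(i,j,block.this_image()).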

Running on batch systems#

This section walks you through launching HPX applications on various batch systems.

How to use HPX applications with PBS#

Most HPX applications are executed on parallel computers. These platforms typically provide integrated job management services that facilitate the allocation of computing resources for each parallel program. HPX includes support for one of the most common job management systems, the Portable Batch System (PBS).

All PBS jobs require a script to specify the resource requirements and other parameters associated with a parallel job. The PBS script is basically a shell script with PBS directives placed within commented sections at the beginning of the file. The remaining (not commented-out) portions of the file execute just like any other regular shell script. While the description of all available PBS options is outside the scope of this tutorial (the interested reader may refer to in-depth documentation for more information), below is a minimal example to illustrate the approach. The following test application will use the multithreaded hello_world_distributed program, explained in the section Remote execution with actions.

#!/bin/bash
#
#PBS -l nodes=2:ppn=4

APP_PATH=~/packages/hpx/bin/hello_world_distributed
APP_OPTIONS=

pbsdsh -u $APP_PATH $APP_OPTIONS --hpx:nodes=`cat $PBS_NODEFILE`

Caution

If the first application specific argument (inside $APP_OPTIONS) is a non-option (i.e., does not start with a - or a --), then the argument has to be placed before the option --hpx:nodes, which, in this case, should be the last option on the command line.

Alternatively, use the option --hpx:endnodes to explicitly mark the end of the list of node names:

$ pbsdsh -u $APP_PATH --hpx:nodes=`cat $PBS_NODEFILE` --hpx:endnodes $APP_OPTIONS

The #PBS -l nodes=2:ppn=4 directive will cause two compute nodes to be allocated for the application, as specified in the option nodes. Each of the nodes will dedicate four cores to the program, as per the option ppn, short for “processors per node” (PBS does not distinguish between processors and cores). Note that requesting more cores per node than physically available is pointless and may prevent PBS from accepting the script.

On newer PBS versions the PBS command syntax might be different. For instance, the PBS script above would look like:

#!/bin/bash
#
#PBS -l select=2:ncpus=4

APP_PATH=~/packages/hpx/bin/hello_world_distributed
APP_OPTIONS=

pbsdsh -u $APP_PATH $APP_OPTIONS --hpx:nodes=`cat $PBS_NODEFILE`

APP_PATH and APP_OPTIONS are shell variables that respectively specify the correct path to the executable (hello_world_distributed in this case) and the command line options. Since the hello_world_distributed application doesn’t need any command line options, APP_OPTIONS has been left empty. Unlike in other execution environments, there is no need to use the --hpx:threads option to indicate the required number of OS threads per node; the HPX library will derive this parameter automatically from PBS.
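
To give a feel for the kind of information a node file carries, the following sketch counts how often each host name occurs in node-file text. It is not the HPX startup code. The function name is hypothetical, and the layout (one host name per line, repeated once per processor requested on that host) is an assumption that holds for many PBS installations but should be checked against the local system.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper, not HPX code: counts how many times each host name
// occurs in node-file text (one host name per line, empty lines ignored).
std::map<std::string, std::size_t> slots_per_node(std::string const& text)
{
    std::map<std::string, std::size_t> slots;
    std::istringstream in(text);
    std::string host;
    while (std::getline(in, host))
    {
        if (!host.empty())
            ++slots[host];
    }
    return slots;
}
```

Under the stated assumption, a job submitted with nodes=2:ppn=4 would yield two entries with a count of four each, that is, two localities with four OS threads apiece.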

Finally, pbsdsh is a PBS command that starts tasks on the resources allocated to the current job. It is recommended to leave this line as shown and modify only the PBS options and shell variables as needed for a specific application.

Important

A script invoked by pbsdsh starts in a very basic environment: the user’s $HOME directory is defined and is the current directory, the LANG variable is set to C and the PATH is set to the basic /usr/local/bin:/usr/bin:/bin as defined in a system-wide file pbs_environment. Nothing that would normally be set up by a system shell profile or user shell profile is defined, unlike the environment for the main job script.

Another choice is for the pbsdsh command in your main job script to invoke your program via a shell, like sh or bash, so that it gives an initialized environment for each instance. Users can create a small script runme.sh, which is used to invoke the program:

#!/bin/bash
# Small script which invokes the program based on what was passed on its
# command line.
#
# This script is executed by the bash shell which will initialize all
# environment variables as usual.
$@

Now, the script is invoked using the pbsdsh tool:

#!/bin/bash
#
#PBS -l nodes=2:ppn=4

APP_PATH=~/packages/hpx/bin/hello_world_distributed
APP_OPTIONS=

pbsdsh -u runme.sh $APP_PATH $APP_OPTIONS --hpx:nodes=`cat $PBS_NODEFILE`

All that remains now is submitting the job to the queuing system. Assuming that the contents of the PBS script were saved in the file pbs_hello_world.sh in the current directory, this is accomplished by typing:

$ qsub ./pbs_hello_world.sh

If the job is accepted, qsub will print out the assigned job ID, which may look like:

42.supercomputer.some.university.edu

To check the status of your job, issue the following command:

$ qstat 42.supercomputer.some.university.edu

and look for a single-letter job status symbol. The common cases include:

  • Q - signifies that the job is queued and awaiting its turn to be executed.

  • R - indicates that the job is currently running.

  • C - means that the job has completed.

The example qstat output below shows a job waiting for execution resources to become available:

Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
42.supercomputer          ...ello_world.sh joe_user               0 Q batch

After the job completes, PBS will place two files, pbs_hello_world.sh.o42 and pbs_hello_world.sh.e42, in the directory where the job was submitted. The first contains the standard output and the second contains the standard error from all the nodes on which the application executed. In our example, the error output file should be empty and the standard output file should contain something similar to:

hello world from OS-thread 3 on locality 0
hello world from OS-thread 2 on locality 0
hello world from OS-thread 1 on locality 1
hello world from OS-thread 0 on locality 0
hello world from OS-thread 3 on locality 1
hello world from OS-thread 2 on locality 1
hello world from OS-thread 1 on locality 0
hello world from OS-thread 0 on locality 1

Congratulations! You have just run your first distributed HPX application!

How to use HPX applications with SLURM#

Just like PBS (described in section How to use HPX applications with PBS), SLURM is a job management system which is widely used on large supercomputing systems. Any HPX application can easily be run using SLURM. This section describes how this can be done.

The easiest way to run an HPX application using SLURM is to utilize the command line tool srun, which interacts with the SLURM batch scheduling system:

$ srun -p <partition> -N <number-of-nodes> hpx-application <application-arguments>

Here, <partition> is one of the node partitions existing on the target machine (consult the machine’s documentation to get a list of existing partitions) and <number-of-nodes> is the number of compute nodes that should be used. By default, the HPX application is started with one locality per node and uses all available cores on a node. You can change the number of localities started per node (for example, to account for NUMA effects) by specifying the -n option of srun. The number of cores per locality can be set by -c. The <application-arguments> are any application specific arguments that need to be passed on to the application.

Note

There is no need to use any of the HPX command line options related to the number of localities, number of threads, or related to networking ports. All of this information is automatically extracted from the SLURM environment by the HPX startup code.
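
For readers curious about what such an environment looks like, the sketch below reads three variables that SLURM commonly sets for a job step: SLURM_NNODES, SLURM_NTASKS and SLURM_CPUS_PER_TASK (the last one typically only when -c is given). The helper names are hypothetical and this is not the HPX startup code, which may consult other variables as well.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: converts an environment value to a count, returning
// 'fallback' when the variable is unset or is not a plain positive number.
std::size_t count_from_env(char const* value, std::size_t fallback)
{
    if (value == nullptr || !std::isdigit(static_cast<unsigned char>(*value)))
        return fallback;

    char* end = nullptr;
    unsigned long parsed = std::strtoul(value, &end, 10);
    if (*end != '\0' || parsed == 0)
        return fallback;
    return static_cast<std::size_t>(parsed);
}

struct slurm_layout
{
    std::size_t nodes;
    std::size_t tasks;
    std::size_t cpus_per_task;
};

// Reads the layout of the current job step from the environment, falling
// back to 1 for anything that is not set.
slurm_layout read_slurm_layout()
{
    return slurm_layout{
        count_from_env(std::getenv("SLURM_NNODES"), 1),
        count_from_env(std::getenv("SLURM_NTASKS"), 1),
        count_from_env(std::getenv("SLURM_CPUS_PER_TASK"), 1)};
}
```

Outside of a SLURM job none of these variables is normally set, so read_slurm_layout() returns 1 for every field.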

Important

The srun documentation explicitly states: “If -c is specified without -n, as many tasks will be allocated per node as possible while satisfying the -c restriction. For instance on a cluster with 8 CPUs per node, a job request for 4 nodes and 3 CPUs per task may be allocated 3 or 6 CPUs per node (1 or 2 tasks per node) depending upon resource consumption by other jobs.” For this reason, it’s recommended to always specify -n <number-of-instances>, even if <number-of-instances> is equal to one (1).

Interactive shells#

To get an interactive development shell on one of the nodes, users can issue the following command:

$ srun -p <node-type> -N <number-of-nodes> --pty /bin/bash -l

After the shell has been opened, users can run their HPX application. By default, it uses all available cores. Note that if you requested one node, you don’t need to do srun again. However, if you requested more than one node, and want to run your distributed application, you can use srun again to start up the distributed HPX application. It will use the resources that have been requested for the interactive shell.

Scheduling batch jobs#

The above mentioned method of running HPX applications is fine for development purposes. The disadvantage that comes with srun is that it only returns once the application is finished. This might not be appropriate for longer-running applications (for example, benchmarks or larger scale simulations). In order to cope with that limitation, users can use the sbatch command.

The sbatch command expects a script that it can run once the requested resources are available. In order to request resources, users need to add #SBATCH comments in their script or provide the necessary parameters to sbatch directly. The parameters are the same as with srun. The commands you need to execute are the same ones you would use to start your application in an interactive shell.

Debugging HPX applications#

Using a debugger with HPX applications#

Using a debugger such as gdb with HPX applications is no problem. However, there are some things to keep in mind to make the experience somewhat more productive.

Call stacks in HPX can often be quite unwieldy as the library is heavily templated and the call stacks can be very deep. For this reason it is sometimes a good idea to compile HPX in RelWithDebInfo mode, which applies some optimizations but keeps debugging symbols. This can often compress call stacks significantly. On the other hand, stepping through the code can also be more difficult because of statements being reordered and variables being optimized away. Also, note that because HPX implements user-space threads and context switching, call stacks may not always be complete in a debugger.

HPX launches not only worker threads but also a few helper threads. The first thread is the main thread, which typically does no work in an HPX application, except at startup and shutdown. If using the default settings, HPX will spawn six additional threads (used for service thread pools). The first worker thread is usually the eighth thread, and most user codes will be run on these worker threads. The last thread is a helper thread used for HPX shutdown.

Finally, since HPX is a multi-threaded runtime, the following gdb options can be helpful:

set pagination off
set non-stop on

Non-stop mode allows users to have a single thread stop on a breakpoint without stopping all other threads as well.

Using sanitizers with HPX applications#

Warning

Not all parts of HPX are sanitizer clean. This means that users may end up with false positives from HPX itself when using sanitizers for their applications.

To use sanitizers with HPX, turn on HPX_WITH_SANITIZERS and turn off HPX_WITH_STACKOVERFLOW_DETECTION during CMake configuration. It’s recommended to also build Boost with the same sanitizers that will be used for HPX. The appropriate sanitizers can then be enabled using CMake by appending -fsanitize=address -fno-omit-frame-pointer to CMAKE_CXX_FLAGS and -fsanitize=address to CMAKE_EXE_LINKER_FLAGS. Replace address with the sanitizer that you want to use.

Debugging applications using core files#

For HPX to generate useful core files, HPX has to be compiled without signal and exception handlers by enabling the CMake option HPX_WITH_DISABLED_SIGNAL_EXCEPTION_HANDLERS:BOOL. If this option is not specified, the signal handlers change the application state. For example, after a segmentation fault the stack trace will show the signal handler. Similarly, unhandled exceptions are also caught by these handlers and the stack trace will not point to the location where the unhandled exception was thrown.

In general, core files are a helpful tool to inspect the state of the application at the moment of the crash (post-mortem debugging), without the need of attaching a debugger beforehand. This approach to debugging is especially useful if the error cannot be reliably reproduced, as only a single crashed application run is required to gain potentially helpful information like a stacktrace.

To debug with core files, the operating system first has to be told to actually write them. On most Unix systems this can be done by calling:

$ ulimit -c unlimited

in the shell. Now the debugger can be started up with:

$ gdb <application> <core file name>

The debugger should now display the last state of the application. The default file name for core files is core.

Optimizing HPX applications#

Performance counters#

Performance counters in HPX are used to provide information as to how well the runtime system or an application is performing. The counter data can help determine system bottlenecks, and fine-tune system and application performance. The HPX runtime system, its networking, and other layers provide counter data that an application can consume to provide users with information about how well the application is performing.

Applications can also use counter data to determine how much system resources to consume. For example, an application that transfers data over the network could consume counter data from a network switch to determine how much data to transfer without competing for network bandwidth with other network traffic. The application could use the counter data to adjust its transfer rate as the bandwidth usage from other network traffic increases or decreases.

Performance counters are HPX parallel processes that expose a predefined interface. HPX provides special API functions that allow you to create, manage, and release instances of performance counters, and to read their counter data. Performance Counter instances are accessed by name, and these names have a predefined structure which is described in the section Performance counter names. The advantage of this is that any Performance Counter can be accessed remotely (from a different locality) or locally (from the same locality). Moreover, since all counters expose their data using the same API, any code consuming counter data can be used to access arbitrary system information with minimal effort.

Counter data may be accessed in real time. More information about how to consume counter data can be found in the section Consuming performance counter data.

All HPX applications provide command line options related to performance counters, such as the ability to list available counter types, or periodically query specific counters to be printed to the screen or save them in a file. For more information, please refer to the section HPX Command Line Options.

Performance counter names#

All Performance Counter instances have a name uniquely identifying each instance. This name can be used to access the counter, retrieve all related meta data, and to query the counter data (as described in the section Consuming performance counter data). Counter names are strings with a predefined structure. The general form of a countername is:

/objectname{full_instancename}/countername@parameters

where full_instancename could be either another (full) counter name or a string formatted as:

parentinstancename#parentindex/instancename#instanceindex

Each separate part of a countername (e.g., objectname, countername, parentinstancename, instancename, and parameters) should start with a letter ('a' to 'z', 'A' to 'Z') or an underscore character ('_'), optionally followed by letters, digits ('0' to '9'), hyphen ('-'), or underscore characters. Whitespace is not allowed inside a counter name. The characters '/', '{', '}', '#' and '@' have a special meaning and are used to delimit the different parts of the counter name.

The parts parentindex and instanceindex are integers. If an index is not specified, HPX will assume a default of -1.
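The naming rules above can be illustrated with a small standalone sketch. It does not use the HPX API; the helper names is_valid_name_part, instance_name, and parse_instance_name are hypothetical and exist only for this illustration. The sketch checks a single name part against the character rule and splits an instance name of the form parentinstancename#parentindex/instancename#instanceindex, filling in -1 for any index that is not given. Wildcard indices such as #* are not handled.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: checks one name part against the rule described above
// (a letter or '_' first, then letters, digits, '-' or '_').
bool is_valid_name_part(std::string const& part)
{
    if (part.empty())
        return false;

    unsigned char const first = static_cast<unsigned char>(part[0]);
    if (std::isalpha(first) == 0 && first != '_')
        return false;

    for (std::size_t i = 1; i < part.size(); ++i)
    {
        unsigned char const c = static_cast<unsigned char>(part[i]);
        if (std::isalnum(c) == 0 && c != '-' && c != '_')
            return false;
    }
    return true;
}

// Hypothetical result type; both indices default to -1 as described above.
struct instance_name
{
    std::string parent_name;
    std::int64_t parent_index = -1;
    std::string name;
    std::int64_t index = -1;
};

// Splits "name#index" into its two parts; the index stays -1 if '#' is absent.
// std::stoll throws if the text after '#' is not a number (e.g. a wildcard).
static void split_indexed(
    std::string const& text, std::string& name, std::int64_t& index)
{
    std::size_t const hash = text.find('#');
    name = text.substr(0, hash);
    index = (hash == std::string::npos) ? -1 : std::stoll(text.substr(hash + 1));
}

// Splits "parentinstancename#parentindex/instancename#instanceindex".
instance_name parse_instance_name(std::string const& full)
{
    instance_name result;
    std::size_t const slash = full.find('/');
    split_indexed(
        full.substr(0, slash), result.parent_name, result.parent_index);
    if (slash != std::string::npos)
        split_indexed(full.substr(slash + 1), result.name, result.index);
    return result;
}
```

For the instance name locality#0/total, this sketch yields the parent name locality with index 0 and the instance name total with the default index -1.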

Two counter name examples#

This section gives examples of both simple counter names and aggregate counter names. For more information on simple and aggregate counter names, please see Performance counter instances.

An example of a well-formed (and meaningful) simple counter name would be:

/threads{locality#0/total}/count/cumulative

This counter returns the current cumulative number of executed (retired) HPX threads for the locality 0. The counter type of this counter is /threads/count/cumulative and the full instance name is locality#0/total. This counter type does not require an instanceindex or parameters to be specified.

In this case, the parentindex (the '0') designates the locality for which the counter instance is created. The counter will return the number of HPX threads retired on that particular locality.

Another example of a well-formed (aggregate) counter name is:

/statistics{/threads{locality#0/total}/count/cumulative}/average@500

This counter takes the simple counter from the first example, samples its values every 500 milliseconds, and returns the average of the value samples whenever it is queried. The counter type of this counter is /statistics/average and the instance name is the full name of the counter for which the values have to be averaged. In this case, the parameters (the '500') specify the sampling interval for the averaging to take place (in milliseconds).
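To make the structure of the two example names concrete, the following standalone sketch splits a full counter name into the parts objectname, full_instancename, countername, and parameters, and derives the counter type from them. It is not the parser used by HPX; the names counter_name_parts, split_counter_name, and counter_type_name are assumptions made for this illustration. Nested braces are tracked so that an aggregate counter keeps the embedded counter name as its instance name.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch.
struct counter_name_parts
{
    bool valid = false;               // true if the name could be split
    std::string objectname;
    std::string full_instancename;    // empty if there is no {...} part
    std::string countername;
    std::string parameters;           // empty if there is no @... part
};

// Splits "/objectname{full_instancename}/countername@parameters".
counter_name_parts split_counter_name(std::string const& name)
{
    counter_name_parts parts;
    if (name.size() < 2 || name[0] != '/')
        return parts;

    // The object name ends at the instance name or at the counter name.
    std::size_t pos = name.find_first_of("{/", 1);
    if (pos == std::string::npos)
        return parts;
    parts.objectname = name.substr(1, pos - 1);

    if (name[pos] == '{')
    {
        // Find the matching closing brace, allowing nested counter names.
        int depth = 0;
        std::size_t end = pos;
        for (; end < name.size(); ++end)
        {
            if (name[end] == '{')
                ++depth;
            else if (name[end] == '}' && --depth == 0)
                break;
        }
        if (end == name.size())
            return parts;    // unbalanced braces
        parts.full_instancename = name.substr(pos + 1, end - pos - 1);
        pos = end + 1;
        if (pos >= name.size() || name[pos] != '/')
            return parts;    // the counter name must follow
    }

    // name[pos] is the '/' in front of the counter name.
    std::size_t const at = name.find('@', pos);
    parts.countername = (at == std::string::npos) ?
        name.substr(pos + 1) :
        name.substr(pos + 1, at - pos - 1);
    if (at != std::string::npos)
        parts.parameters = name.substr(at + 1);
    parts.valid = true;
    return parts;
}

// The counter type consists of the object name and the counter name only.
std::string counter_type_name(counter_name_parts const& parts)
{
    return "/" + parts.objectname + "/" + parts.countername;
}
```

Applied to the aggregate example, the sketch reports the object name statistics, the instance name /threads{locality#0/total}/count/cumulative, the counter name average, and the parameters 500.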

Performance counter types#

Every performance counter belongs to a specific performance counter type which classifies the counters into groups of common semantics. The type of a counter is identified by the objectname and the countername parts of the name.

/objectname/countername

When an application starts, HPX registers all available counter types on each of the localities. These counter types are held in a special performance counter registration database, which can be used to retrieve the meta data related to a counter type and to create counter instances based on a given counter instance name.

Performance counter instances#

The full_instancename distinguishes different counter instances of the same counter type. The formatting of the full_instancename depends on the counter type. There are two types of counters: simple counters, which usually generate the counter values based on direct measurements, and aggregate counters, which take another counter and transform its values before generating their own counter values. An example for a simple counter is given above: counting retired HPX threads. An aggregate counter is shown as an example above as well: calculating the average of the underlying counter values sampled at constant time intervals.

While simple counters use instance names formatted as parentinstancename#parentindex/instancename#instanceindex, most aggregate counters have the full counter name of the embedded counter as their instance name.

Not all simple counter types require specifying all four elements of a full counter instance name; some of the parts (parentinstancename, parentindex, instancename, and instanceindex) are optional for specific counters. Please refer to the documentation of a particular counter for more information about the formatting requirements for the name of this counter (see Existing HPX performance counters).

The parameters are used to pass additional information to a counter at creation time. They are optional, and they fully depend on the concrete counter. Even if a specific counter type allows additional parameters to be given, those usually are not required as sensible defaults will be chosen. Please refer to the documentation of a particular counter for more information about what parameters are supported, how to specify them, and what default values are assumed (see also Existing HPX performance counters).

Every locality of an application exposes its own set of performance counter types and performance counter instances. The set of exposed counters is determined dynamically at application start based on the execution environment of the application. For instance, this set is influenced by the current hardware environment for the locality (such as whether the locality has access to accelerators), and the software environment of the application (such as the number of OS threads used to execute HPX threads).

Using wildcards in performance counter names#

It is possible to use wildcard characters when specifying performance counter names. Performance counter names can contain two types of wildcard characters:

  • Wildcard characters in the performance counter type

  • Wildcard characters in the performance counter instance name

A wildcard character has a meaning which is very close to usual file name wildcard matching rules implemented by common shells (like bash).

Table 25 Wildcard characters in the performance counter type#

Wildcard | Description
* | This wildcard character matches any number (zero or more) of arbitrary characters.
? | This wildcard character matches any single arbitrary character.
[...] | This wildcard character matches any single character from the list specified within the square brackets.

Table 26 Wildcard characters in the performance counter instance name#

Wildcard | Description
* | This wildcard character matches any locality or any thread, depending on whether it is used for locality#* or worker-thread#*. No other wildcards are allowed in counter instance names.
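The matching rules for wildcards in the counter type can be sketched with a small standalone matcher. This is an illustration of shell-style matching only, not the matcher HPX uses internally; the function name wildcard_match is hypothetical. The sketch supports '*', '?', and a plain character list in square brackets (no ranges or negation).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if 'text' matches 'pattern', where
//   '*'     matches zero or more arbitrary characters,
//   '?'     matches exactly one arbitrary character,
//   '[abc]' matches exactly one of the listed characters.
bool wildcard_match(std::string const& pattern, std::string const& text,
    std::size_t p = 0, std::size_t t = 0)
{
    while (p < pattern.size())
    {
        char const c = pattern[p];
        if (c == '*')
        {
            // Try every possible length for the part matched by '*'.
            for (std::size_t next = t; next <= text.size(); ++next)
            {
                if (wildcard_match(pattern, text, p + 1, next))
                    return true;
            }
            return false;
        }

        // All remaining pattern elements consume exactly one character.
        if (t == text.size())
            return false;

        if (c == '[')
        {
            std::size_t const close = pattern.find(']', p + 1);
            if (close == std::string::npos)
                return false;    // malformed pattern
            std::string const set = pattern.substr(p + 1, close - p - 1);
            if (set.find(text[t]) == std::string::npos)
                return false;
            p = close + 1;
        }
        else
        {
            if (c != '?' && c != text[t])
                return false;
            ++p;
        }
        ++t;
    }
    return t == text.size();
}
```

With this sketch, the pattern /threads/count/* matches the counter type /threads/count/cumulative, while /agas/* does not match /threads/idle-rate.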

Consuming performance counter data#

You can consume performance data using either the command line interface of an HPX application or the HPX API. The command line interface is easier to use, but it is less flexible and does not allow you to adjust the behavior of your application at runtime. It provides a convenient but simplified abstraction for querying and logging performance counter data for a set of performance counters.

Consuming performance counter data from the command line#

HPX provides a set of predefined command line options for every application that uses hpx::init for its initialization. While there are many more command line options available (see HPX Command Line Options), the set of options related to performance counters allows one to list existing counters, and query existing counters once at application termination or repeatedly after a constant time interval.

The following table summarizes the available command line options:

Table 27 HPX Command Line Options Related to Performance Counters#

Command line option | Description
--hpx:print-counter | Prints the specified performance counter repeatedly and/or at the times specified by --hpx:print-counter-at (see also option --hpx:print-counter-interval).
--hpx:print-counter-reset | Prints the specified performance counter repeatedly and/or at the times specified by --hpx:print-counter-at. Resets the counter after the value is queried (see also option --hpx:print-counter-interval).
--hpx:print-counter-interval | Prints the performance counter(s) specified with --hpx:print-counter repeatedly after the time interval (specified in milliseconds). Default: 0, which means print once at shutdown.
--hpx:print-counter-destination | Prints the performance counter(s) specified with --hpx:print-counter to the given file (default: console).
--hpx:list-counters | Lists the names of all registered performance counters.
--hpx:list-counter-infos | Lists the description of all registered performance counters.
--hpx:print-counter-format | Prints the performance counter(s) specified with --hpx:print-counter in CSV format, with or without a header (see option --hpx:no-csv-header). Possible values: csv (prints counter values in CSV format with full names as header), csv-short (prints counter values in CSV format with the short names provided with --hpx:print-counter as --hpx:print-counter shortname,full-countername).
--hpx:no-csv-header | Prints the performance counter(s) specified with --hpx:print-counter in the csv or csv-short format specified with --hpx:print-counter-format, without a header.
--hpx:print-counter-at arg | Prints the performance counter(s) specified with --hpx:print-counter (or --hpx:print-counter-reset) at the given point in time. Possible argument values: startup, shutdown (default), noshutdown.
--hpx:reset-counters | Resets all performance counter(s) specified with --hpx:print-counter after they have been evaluated.
--hpx:print-counter-types | Appends counter type description to generated output.
--hpx:print-counters-locally | Each locality prints only its own local counters.

While the options --hpx:list-counters and --hpx:list-counter-infos give a short list of all available counters, the full documentation for those can be found in the section Existing HPX performance counters.

A simple example#

All of the commandline options mentioned above can be tested using the hello_world_distributed example.

Listing all available counters with hello_world_distributed --hpx:list-counters yields:

List of available counter instances (replace * below with the appropriate
sequence number)
-------------------------------------------------------------------------
/agas/count/allocate
/agas/count/bind
/agas/count/bind_gid
/agas/count/bind_name
...
/threads{locality#*/allocator#*}/count/objects
/threads{locality#*/total}/count/stack-recycles
/threads{locality#*/total}/idle-rate
/threads{locality#*/worker-thread#*}/idle-rate

Providing more information about all available counters, hello_world_distributed --hpx:list-counter-infos yields:

Information about available counter instances (replace * below with the
appropriate sequence number)
------------------------------------------------------------------------------
fullname: /agas/count/allocate
helptext: returns the number of invocations of the AGAS service 'allocate'
type: counter_type::raw
version: 1.0.0
------------------------------------------------------------------------------

------------------------------------------------------------------------------
fullname: /agas/count/bind
helptext: returns the number of invocations of the AGAS service 'bind'
type: counter_type::raw
version: 1.0.0
------------------------------------------------------------------------------

------------------------------------------------------------------------------
fullname: /agas/count/bind_gid
helptext: returns the number of invocations of the AGAS service 'bind_gid'
type: counter_type::raw
version: 1.0.0
------------------------------------------------------------------------------

...

This command will not only list the counter names but also a short description of the data exposed by this counter.

Note

The list of available counters may differ depending on the concrete execution environment (hardware or software) of your application.

Requesting the counter data for one or more performance counters can be achieved by invoking hello_world_distributed with a list of counter names:

$ hello_world_distributed \
    --hpx:print-counter=/threads{locality#0/total}/count/cumulative \
    --hpx:print-counter=/agas{locality#0/total}/count/bind

which yields for instance:

hello world from OS-thread 0 on locality 0
/threads{locality#0/total}/count/cumulative,1,0.212527,[s],33
/agas{locality#0/total}/count/bind,1,0.212790,[s],11

The first line is the normal output generated by hello_world_distributed and has no relation to the counter data listed. The last two lines contain the counter data as gathered at application shutdown. Each of these lines has up to six comma-separated fields: the counter name, the sequence number of the counter invocation, the time stamp at which this information was sampled, the unit of measure for the time stamp, the actual counter value, and an optional unit of measure for the counter value (absent in the output above).

Note

The command line option --hpx:print-counter-types will append a seventh field to the generated output. This field will hold an abbreviated counter type.

The actual counter value can be represented by a single number (for counters returning singular values) or a list of numbers separated by ':' (for counters returning an array of values, like for instance a histogram).

Note

The name of the performance counter will be enclosed in double quotes '"' if it contains one or more commas ','.
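If the printed counter data is post-processed by another tool, the line format described above can be split along the following lines. This is a minimal standalone sketch under the assumptions stated in this section (comma-separated fields, an optionally double-quoted counter name, and ':' between the elements of an array value); the function names split_counter_line and parse_counter_values are hypothetical and not part of HPX.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits one line of counter output into its fields.
// A counter name enclosed in double quotes may itself contain commas.
std::vector<std::string> split_counter_line(std::string const& line)
{
    std::vector<std::string> fields;
    std::size_t pos = 0;

    if (!line.empty() && line[0] == '"')
    {
        std::size_t const close = line.find('"', 1);
        if (close == std::string::npos)
            return fields;    // malformed: unterminated quote
        fields.push_back(line.substr(1, close - 1));
        // Skip the closing quote and the comma assumed to follow it.
        pos = close + 2;
    }

    while (pos <= line.size())
    {
        std::size_t const comma = line.find(',', pos);
        if (comma == std::string::npos)
        {
            fields.push_back(line.substr(pos));
            break;
        }
        fields.push_back(line.substr(pos, comma - pos));
        pos = comma + 1;
    }
    return fields;
}

// Hypothetical helper: converts the value field into numbers. A single value
// yields one element, a ':'-separated list yields one element per entry.
// std::stod throws if an entry is not a number.
std::vector<double> parse_counter_values(std::string const& field)
{
    std::vector<double> values;
    std::size_t pos = 0;
    while (pos <= field.size())
    {
        std::size_t const colon = field.find(':', pos);
        std::string const item = (colon == std::string::npos) ?
            field.substr(pos) :
            field.substr(pos, colon - pos);
        values.push_back(std::stod(item));
        if (colon == std::string::npos)
            break;
        pos = colon + 1;
    }
    return values;
}
```

For the first counter line shown above, split_counter_line returns five fields, the last of which is the value 33; passing that field to parse_counter_values gives a single-element list.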

Requesting the counter data repeatedly after a constant time interval with this command line:

$ hello_world_distributed \
    --hpx:print-counter=/threads{locality#0/total}/count/cumulative \
    --hpx:print-counter=/agas{locality#0/total}/count/bind \
    --hpx:print-counter-interval=20

yields for instance (leaving off the actual console output of the hello_world_distributed example for brevity):

/threads{locality#0/total}/count/cumulative,1,0.002409,[s],22
/agas{locality#0/total}/count/bind,1,0.002542,[s],9
/threads{locality#0/total}/count/cumulative,2,0.023002,[s],41
/agas{locality#0/total}/count/bind,2,0.023557,[s],10
/threads{locality#0/total}/count/cumulative,3,0.037514,[s],46
/agas{locality#0/total}/count/bind,3,0.038679,[s],10

The command line option --hpx:print-counter-destination=<file> will redirect all counter data gathered to the specified file name, which avoids cluttering the console output of your application.

The command line option --hpx:print-counter supports using a limited set of wildcards for a (very limited) set of use cases. In particular, all occurrences of #* as in locality#* and in worker-thread#* will be automatically expanded to the proper set of performance counter names representing the actual environment for the executed program. For instance, if your program is utilizing four worker threads for the execution of HPX threads (see command line option --hpx:threads) the following command line

$ hello_world_distributed \
    --hpx:threads=4 \
    --hpx:print-counter=/threads{locality#0/worker-thread#*}/count/cumulative

will print the value of the performance counters monitoring each of the worker threads:

hello world from OS-thread 1 on locality 0
hello world from OS-thread 0 on locality 0
hello world from OS-thread 3 on locality 0
hello world from OS-thread 2 on locality 0
/threads{locality#0/worker-thread#0}/count/cumulative,1,0.0025214,[s],27
/threads{locality#0/worker-thread#1}/count/cumulative,1,0.0025453,[s],33
/threads{locality#0/worker-thread#2}/count/cumulative,1,0.0025683,[s],29
/threads{locality#0/worker-thread#3}/count/cumulative,1,0.0025904,[s],33

The command line option --hpx:print-counter-format takes the values csv and csv-short to generate CSV formatted counter values with a header.

With format as csv:

$ hello_world_distributed \
    --hpx:threads=2 \
    --hpx:print-counter-format csv \
    --hpx:print-counter /threads{locality#*/total}/count/cumulative \
    --hpx:print-counter /threads{locality#*/total}/count/cumulative-phases

will print the values of performance counters in CSV format with the full countername as a header:

hello world from OS-thread 1 on locality 0
hello world from OS-thread 0 on locality 0
/threads{locality#*/total}/count/cumulative,/threads{locality#*/total}/count/cumulative-phases
39,93

With format csv-short:

$ hello_world_distributed \
    --hpx:threads 2 \
    --hpx:print-counter-format csv-short \
    --hpx:print-counter cumulative,/threads{locality#*/total}/count/cumulative \
    --hpx:print-counter phases,/threads{locality#*/total}/count/cumulative-phases

will print the values of performance counters in CSV format with the short countername as a header:

hello world from OS-thread 1 on locality 0
hello world from OS-thread 0 on locality 0
cumulative,phases
39,93

When the format csv or csv-short is combined with --hpx:print-counter-interval:

$ hello_world_distributed \
    --hpx:threads 2 \
    --hpx:print-counter-format csv-short \
    --hpx:print-counter cumulative,/threads{locality#*/total}/count/cumulative \
    --hpx:print-counter phases,/threads{locality#*/total}/count/cumulative-phases \
    --hpx:print-counter-interval 5

will print the header only once, followed by the performance counter value(s) at each interval:

cumulative,phases
25,42
hello world from OS-thread 1 on locality 0
hello world from OS-thread 0 on locality 0
44,95

The command line option --hpx:no-csv-header can be used with --hpx:print-counter-format to print performance counter values in CSV format without any header:

$ hello_world_distributed \
    --hpx:threads 2 \
    --hpx:print-counter-format csv-short \
    --hpx:print-counter cumulative,/threads{locality#*/total}/count/cumulative \
    --hpx:print-counter phases,/threads{locality#*/total}/count/cumulative-phases \
    --hpx:no-csv-header

will print:

hello world from OS-thread 1 on locality 0
hello world from OS-thread 0 on locality 0
37,91

Consuming performance counter data using the HPX API#

HPX provides an API that allows users to discover performance counters and to retrieve the current value of any existing performance counter from any application.

Discover existing performance counters#

Retrieve the current value of any performance counter#

Performance counters are specialized HPX components. In order to retrieve a counter value, the performance counter needs to be instantiated. HPX exposes a client component object for this purpose:

hpx::performance_counters::performance_counter counter(std::string const& name);

Instantiating an instance of this type will create the performance counter identified by the given name. Only the first invocation for any given counter name will create a new instance of that counter. All following invocations for a given counter name will reference the initially created instance. This ensures that at any point in time there is never more than one active instance of any of the existing performance counters.

In order to access the counter value (or to invoke any of the other functionality related to a performance counter, like start, stop or reset) member functions of the created client component instance should be called:

// print the cumulative number of HPX threads executed so far on locality 0
hpx::performance_counters::performance_counter count(
    "/threads{locality#0/total}/count/cumulative");
hpx::cout << count.get_value<int>().get() << std::endl;

For more information about the client component type, see hpx::performance_counters::performance_counter.

Note

In the above example count.get_value() returns a future. In order to print the result we must append .get() to retrieve the value. You could write the above example like this for more clarity:

// print the cumulative number of HPX threads executed so far on locality 0
hpx::performance_counters::performance_counter count(
    "/threads{locality#0/total}/count/cumulative");
hpx::future<int> result = count.get_value<int>();
hpx::cout << result.get() << std::endl;

Providing performance counter data#

HPX offers several ways by which you may provide your own data as a performance counter. This has the benefit of exposing additional, possibly application-specific information using the existing Performance Counter framework, unifying the process of gathering data about your application.

An application that wants to provide counter data can implement a performance counter to provide the data. When a consumer queries performance data, the HPX runtime system calls the provider to collect the data. The runtime system uses an internal registry to determine which provider to call.

Generally, there are two ways of exposing your own performance counter data: a simple, function-based way and a more complex, but more powerful way of implementing a full performance counter. Both alternatives are described in the following sections.

Exposing performance counter data using a simple function#

The simplest way to expose arbitrary numeric data is to write a function which will then be called whenever a consumer queries this counter. Currently, this type of performance counter can only be used to expose integer values. The expected signature of this function is:

std::int64_t some_performance_data(bool reset);

The argument bool reset (which is supplied by the runtime system when the function is invoked) specifies whether the counter value should be reset after evaluating the current value (if applicable).

For instance, here is such a function returning how often it was invoked:

#include <atomic>
#include <cstdint>

// The atomic variable 'counter' ensures the thread safety of the counter.
std::atomic<std::int64_t> counter(0);

std::int64_t some_performance_data(bool reset)
{
    std::int64_t result = ++counter;
    if (reset)
        counter = 0;
    return result;
}

This example function exposes a linearly-increasing value as our performance data. The value is incremented on each invocation, i.e., each time a consumer requests the counter data of this performance counter.
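In the function above, the increment and the reset are two separate atomic operations, so an increment performed by another thread between them could be lost. If that matters for your data, one possible variation is to perform the reset with a single atomic exchange. The following standalone sketch shows the idea; the names invocation_count and some_performance_data_atomic are hypothetical and chosen only to distinguish this variant from the example above.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical counter state for this sketch.
std::atomic<std::int64_t> invocation_count(0);

// Same signature as the function expected by the runtime system.
std::int64_t some_performance_data_atomic(bool reset)
{
    if (reset)
    {
        // exchange(0) stores 0 and returns the previous value in one atomic
        // step; adding 1 accounts for the current invocation.
        return invocation_count.exchange(0) + 1;
    }

    // Pre-increment on std::atomic is a single atomic read-modify-write.
    return ++invocation_count;
}
```

Called from a single thread, this variant returns the same sequence of values as the original function: the count keeps increasing until a call with reset set to true, after which it starts again from 1.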

The next step in exposing this counter to the runtime system is to register the function as a new raw counter type using the HPX API function hpx::performance_counters::install_counter_type. A counter type represents certain common characteristics of counters, like their counter type name and any associated description information. The following snippet shows an example of how to register the function some_performance_data, which is shown above, for a counter type named "/test/data". This registration has to be executed before any consumer instantiates and queries an instance of this counter type:

#include <hpx/include/performance_counters.hpp>

void register_counter_type()
{
    // Call the HPX API function to register the counter type.
    hpx::performance_counters::install_counter_type(
        "/test/data",                                   // counter type name
        &some_performance_data,                         // function providing counter data
        "returns a linearly increasing counter value",  // description text (optional)
        ""                                              // unit of measure (optional)
    );
}

Now it is possible to instantiate a new counter instance based on the naming scheme "/test{locality#*/total}/data" where * is a zero-based integer index identifying the locality for which the counter instance should be accessed. The function hpx::performance_counters::install_counter_type enables users to instantiate exactly one counter instance for each locality. Repeated requests to instantiate such a counter will return the same instance, i.e., the instance created for the first request.

If this counter needs to be accessed using the standard HPX command line options, the registration has to be performed during application startup, before hpx_main is executed. The best way to achieve this is to register an HPX startup function using the API function hpx::register_startup_function before calling hpx::init to initialize the runtime system:

int main(int argc, char* argv[])
{
    // By registering the counter type we make it available to any consumer
    // who creates and queries an instance of the type "/test/data".
    //
    // This registration should be performed during startup. The
    // function 'register_counter_type' should be executed as an HPX thread right
    // before hpx_main is executed.
    hpx::register_startup_function(&register_counter_type);

    // Initialize and run HPX.
    return hpx::init(argc, argv);
}

Please see the code in simplest_performance_counter.cpp for a full example demonstrating this functionality.

Implementing a full performance counter#

Sometimes, the simple way of exposing a single value as a performance counter is not sufficient. For that reason, HPX provides a means of implementing full performance counters which support:

  • Retrieving the descriptive information about the performance counter

  • Retrieving the current counter value

  • Resetting the performance counter (value)

  • Starting the performance counter

  • Stopping the performance counter

  • Setting the (initial) value of the performance counter

Every full performance counter will implement a predefined interface:

//  Copyright (c) 2007-2023 Hartmut Kaiser
//
//  SPDX-License-Identifier: BSL-1.0
//  Distributed under the Boost Software License, Version 1.0. (See accompanying
//  file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

#pragma once

#include <hpx/config.hpp>
#include <hpx/components/client_base.hpp>
#include <hpx/modules/async_base.hpp>
#include <hpx/modules/execution.hpp>
#include <hpx/modules/functional.hpp>
#include <hpx/modules/futures.hpp>

#include <hpx/performance_counters/counters_fwd.hpp>
#include <hpx/performance_counters/server/base_performance_counter.hpp>

#include <string>
#include <utility>
#include <vector>

#include <hpx/config/warnings_prefix.hpp>

///////////////////////////////////////////////////////////////////////////////
namespace hpx::performance_counters {

    ///////////////////////////////////////////////////////////////////////////
    struct HPX_EXPORT performance_counter
      : components::client_base<performance_counter,
            server::base_performance_counter>
    {
        using base_type = components::client_base<performance_counter,
            server::base_performance_counter>;

        performance_counter() = default;

        explicit performance_counter(std::string const& name);

        performance_counter(
            std::string const& name, hpx::id_type const& locality);

        performance_counter(id_type const& id)
          : base_type(id)
        {
        }

        performance_counter(future<id_type>&& id)
          : base_type(HPX_MOVE(id))
        {
        }

        performance_counter(hpx::future<performance_counter>&& c)
          : base_type(HPX_MOVE(c))
        {
        }

        ///////////////////////////////////////////////////////////////////////
        future<counter_info> get_info() const;
        counter_info get_info(
            launch::sync_policy, error_code& ec = throws) const;

        future<counter_value> get_counter_value(bool reset) const;
        counter_value get_counter_value(
            launch::sync_policy, bool reset, error_code& ec = throws) const;

        future<counter_value> get_counter_value() const;
        counter_value get_counter_value(
            launch::sync_policy, error_code& ec = throws) const;

        future<counter_values_array> get_counter_values_array(bool reset) const;
        counter_values_array get_counter_values_array(
            launch::sync_policy, bool reset, error_code& ec = throws) const;

        future<counter_values_array> get_counter_values_array() const;
        counter_values_array get_counter_values_array(
            launch::sync_policy, error_code& ec = throws) const;

        ///////////////////////////////////////////////////////////////////////
        future<bool> start() const;
        bool start(launch::sync_policy, error_code& ec = throws) const;

        future<bool> stop() const;
        bool stop(launch::sync_policy, error_code& ec = throws) const;

        future<void> reset() const;
        void reset(launch::sync_policy, error_code& ec = throws) const;

        future<void> reinit(bool reset = true) const;
        void reinit(launch::sync_policy, bool reset = true,
            error_code& ec = throws) const;

        ///////////////////////////////////////////////////////////////////////
        future<std::string> get_name() const;
        std::string get_name(
            launch::sync_policy, error_code& ec = throws) const;

    private:
        template <typename T>
        static T extract_value(future<counter_value>&& value)
        {
            return value.get().get_value<T>();
        }

    public:
        template <typename T>
        future<T> get_value(bool reset = false)
        {
            return get_counter_value(reset).then(hpx::launch::sync,
                hpx::bind_front(&performance_counter::extract_value<T>));
        }
        template <typename T>
        T get_value(
            launch::sync_policy, bool reset = false, error_code& ec = throws)
        {
            return get_counter_value(launch::sync, reset).get_value<T>(ec);
        }

        template <typename T>
        future<T> get_value() const
        {
            return get_counter_value(false).then(hpx::launch::sync,
                hpx::bind_front(&performance_counter::extract_value<T>));
        }
        template <typename T>
        T get_value(launch::sync_policy, error_code& ec = throws) const
        {
            return get_counter_value(launch::sync, false).get_value<T>(ec);
        }
    };

    // Return all counters matching the given name (with optional wild cards).
    HPX_EXPORT std::vector<performance_counter> discover_counters(
        std::string const& name, error_code& ec = throws);
}    // namespace hpx::performance_counters

#include <hpx/config/warnings_suffix.hpp>

To implement a full performance counter, you have to create an HPX component exposing this interface. To simplify this task, HPX provides a ready-made base class which handles all the boilerplate of creating a component for you. The remainder of this section explains the process of creating a full performance counter based on the Sine example, which you can find in the directory examples/performance_counters/sine/.

The base class is defined in the header file base_performance_counter.hpp as:

//  Copyright (c) 2007-2025 Hartmut Kaiser
//
//  SPDX-License-Identifier: BSL-1.0
//  Distributed under the Boost Software License, Version 1.0. (See accompanying
//  file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

#pragma once

#include <hpx/config.hpp>
#include <hpx/actions_base/component_action.hpp>
#include <hpx/components_base/component_type.hpp>
#include <hpx/components_base/server/component_base.hpp>
#include <hpx/modules/runtime_local.hpp>
#include <hpx/performance_counters/counters.hpp>
#include <hpx/performance_counters/server/base_performance_counter.hpp>

///////////////////////////////////////////////////////////////////////////////
//[performance_counter_base_class
namespace hpx::performance_counters {

    template <typename Derived>
    class base_performance_counter;
}    // namespace hpx::performance_counters
//]

///////////////////////////////////////////////////////////////////////////////
namespace hpx::performance_counters {

    template <typename Derived>
    class base_performance_counter
      : public hpx::performance_counters::server::base_performance_counter
      , public hpx::components::component_base<Derived>
    {
    private:
        using base_type = hpx::components::component_base<Derived>;

    public:
        using type_holder = Derived;
        using base_type_holder =
            hpx::performance_counters::server::base_performance_counter;

        // NOLINTBEGIN(bugprone-crtp-constructor-accessibility)
        base_performance_counter() = default;

        explicit base_performance_counter(
            hpx::performance_counters::counter_info const& info)
          : base_type_holder(info)
        {
        }
        // NOLINTEND(bugprone-crtp-constructor-accessibility)

        // Disambiguate finalize() which is implemented in both base classes
        void finalize()
        {
            base_type_holder::finalize();
            base_type::finalize();
        }

        hpx::naming::address get_current_address() const
        {
            return hpx::naming::address(
                hpx::naming::get_gid_from_locality_id(hpx::get_locality_id()),
                hpx::components::get_component_type<Derived>(),
                const_cast<Derived*>(static_cast<Derived const*>(this)));
        }
    };
}    // namespace hpx::performance_counters

The single template parameter is expected to receive the type of the derived class implementing the performance counter. In the Sine example this looks like:

//  Copyright (c) 2007-2012 Hartmut Kaiser
//
//  SPDX-License-Identifier: BSL-1.0
//  Distributed under the Boost Software License, Version 1.0. (See accompanying
//  file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

#pragma once

#include <hpx/config.hpp>
#if !defined(HPX_COMPUTE_DEVICE_CODE)
#include <hpx/hpx.hpp>
#include <hpx/include/lcos_local.hpp>
#include <hpx/include/performance_counters.hpp>
#include <hpx/include/util.hpp>

#include <cstdint>

namespace performance_counters { namespace sine { namespace server {
    ///////////////////////////////////////////////////////////////////////////
    //[sine_counter_definition
    class sine_counter
      : public hpx::performance_counters::base_performance_counter<sine_counter>
    //]
    {
    public:
        sine_counter()
          : current_value_(0)
          , evaluated_at_(0)
        {
        }
        explicit sine_counter(
            hpx::performance_counters::counter_info const& info);

        /// This function will be called in order to query the current value of
        /// this performance counter
        hpx::performance_counters::counter_value get_counter_value(bool reset);

        /// The functions below will be called to start and stop collecting
        /// counter values from this counter.
        bool start();
        bool stop();

        /// finalize() will be called just before the instance gets destructed
        void finalize();

    protected:
        bool evaluate();

    private:
        typedef hpx::spinlock mutex_type;

        mutable mutex_type mtx_;
        double current_value_;
        std::uint64_t evaluated_at_;

        hpx::util::interval_timer timer_;
    };
}}}    // namespace performance_counters::sine::server
#endif

That is, the type sine_counter derives from the base class and passes itself as the template argument (please see simplest_performance_counter.cpp for the full source code of the counter definition). For more information about this technique, called the Curiously Recurring Template Pattern (CRTP), see for instance the corresponding Wikipedia article. This base class itself is derived from the performance_counter interface described above.
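The pattern can be illustrated without any HPX dependency. The following is a minimal sketch, not HPX code: counter_base, fixed_counter, and their members are hypothetical names chosen for illustration. It shows how a base class template can call members of the derived class through a static_cast, with no virtual functions involved.

```cpp
#include <cassert>
#include <string>

// Hypothetical CRTP base (not part of HPX). It reaches the derived class by
// casting `this` to the type it received as template argument.
template <typename Derived>
class counter_base
{
public:
    std::string describe() const
    {
        Derived const& self = static_cast<Derived const&>(*this);
        return self.name() + "=" + std::to_string(self.value());
    }
};

// The derived class passes itself as the template argument, in the same way
// sine_counter passes itself to base_performance_counter.
class fixed_counter : public counter_base<fixed_counter>
{
public:
    std::string name() const { return "fixed"; }
    int value() const { return 42; }
};

// Usage: describe() is defined in the base but resolves name() and value()
// in fixed_counter at compile time.
inline void crtp_usage()
{
    fixed_counter c;
    assert(c.describe() == "fixed=42");
}
```

Because the dispatch is resolved at compile time, the base class can provide shared functionality while the derived class supplies only the counter-specific parts.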

Additionally, a full performance counter implementation not only exposes the actual value but also provides information about:

  • The point in time a particular value was retrieved.

  • A (sequential) invocation count.

  • The actual counter value.

  • An optional scaling coefficient.

  • Information about the counter status.
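The items above can be pictured as one record per sample. The following is a sketch only: counter_sample, sample_status, and scaled_value are hypothetical names, the member types are assumptions, and the scaling coefficient is assumed here to act as a divisor. The actual HPX type may differ in all of these respects.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical status codes; the enumeration used by HPX may differ.
enum class sample_status
{
    valid,
    invalid
};

// Hypothetical record with one member per item in the list above.
struct counter_sample
{
    std::int64_t time;       // point in time the value was retrieved
    std::uint64_t count;     // sequential invocation count
    std::int64_t value;      // raw counter value
    std::int64_t scaling;    // scaling coefficient, assumed to be a divisor
    sample_status status;    // counter status
};

// Apply the scaling coefficient; a coefficient of 1 leaves the value as is.
inline double scaled_value(counter_sample const& s)
{
    assert(s.status == sample_status::valid);
    assert(s.scaling != 0);
    return static_cast<double>(s.value) / static_cast<double>(s.scaling);
}
```

Storing an integer value together with a scaling coefficient is one way to transport fractional values without using floating point in the record itself.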

Existing HPX performance counters#

The HPX runtime system exposes a wide variety of predefined performance counters. These counters expose critical information about different modules of the runtime system. They can help determine system bottlenecks and fine-tune system and application performance.

Table 28 AGAS performance counter /agas/count/<agas_service>#

Counter type

/agas/count/<agas_service>

where <agas_service> is one of the following:

primary namespace services: route, bind_gid, resolve_gid, unbind_gid, increment_credit, decrement_credit, allocate, begin_migration, end_migration

component namespace services: bind_prefix, bind_name, resolve_id, unbind_name, iterate_types, get_component_typename, num_localities_type

locality namespace services: free, localities, num_localities, num_threads, resolve_locality, resolved_localities

symbol namespace services: bind, resolve, unbind, iterate_names, on_symbol_namespace_event

Counter instance formatting

<agas_instance>/total

where <agas_instance> is the name of the AGAS service to query. Currently, this value will be locality#0 where 0 is the root locality (the id of the locality hosting the AGAS service).

The value for * can be any locality id for the following <agas_service>: route, bind_gid, resolve_gid, unbind_gid, increment_credit, decrement_credit, bind, resolve, unbind, and iterate_names (only the primary and symbol AGAS service components live on all localities, whereas all other AGAS services are available on locality#0 only).

Description

Returns the total number of invocations of the specified AGAS service since its creation.
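To query a counter, its counter type and instance name are combined into one full counter name. The sketch below assumes the form /object{instance}/counter that HPX documents in its section on performance counter names. The helper make_counter_name is hypothetical and not part of the HPX API.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: insert "{instance}" after the first path element of
// the counter type, e.g. "/agas/count/route" + "locality#0/total" becomes
// "/agas{locality#0/total}/count/route".
inline std::string make_counter_name(
    std::string const& counter_type, std::string const& instance)
{
    assert(!counter_type.empty() && counter_type[0] == '/');
    std::string::size_type const object_end = counter_type.find('/', 1);
    assert(object_end != std::string::npos);
    return counter_type.substr(0, object_end) + "{" + instance + "}" +
        counter_type.substr(object_end);
}

// Usage with a counter type and instance name from the table above.
inline void counter_name_usage()
{
    assert(make_counter_name("/agas/count/route", "locality#0/total") ==
        "/agas{locality#0/total}/count/route");
}
```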

Table 29 AGAS performance counter /agas/<agas_service_category>/count#

Counter type

/agas/<agas_service_category>/count

where <agas_service_category> is one of the following: primary, locality, component or symbol

Counter instance formatting

<agas_instance>/total

where <agas_instance> is the name of the AGAS service to query. Currently, this value will be locality#0 where 0 is the root locality (the id of the locality hosting the AGAS service). The exception is an <agas_service_category> of primary or symbol, for which the value for * can be any locality id (only the primary and symbol AGAS service components live on all localities, whereas all other AGAS services are available on locality#0 only).

Description

Returns the overall total number of invocations of all AGAS services provided by the given AGAS service category since its creation.

Table 31 AGAS performance counter /agas/time/<agas_service>#

Counter type

/agas/time/<agas_service>

where <agas_service> is one of the following:

primary namespace services: route, bind_gid, resolve_gid, unbind_gid, increment_credit, decrement_credit, allocate, begin_migration, end_migration

component namespace services: bind_prefix, bind_name, resolve_id, unbind_name, iterate_types, get_component_typename, num_localities_type

locality namespace services: free, localities, num_localities, num_threads, resolve_locality, resolved_localities

symbol namespace services: bind, resolve, unbind, iterate_names, on_symbol_namespace_event

Counter instance formatting

<agas_instance>/total

where <agas_instance> is the name of the AGAS service to query. Currently, this value will be locality#0 where 0 is the root locality (the id of the locality hosting the AGAS service).

The value for * can be any locality id for the following <agas_service>: route, bind_gid, resolve_gid, unbind_gid, increment_credit, decrement_credit, bind, resolve, unbind, and iterate_names (only the primary and symbol AGAS service components live on all localities, whereas all other AGAS services are available on locality#0 only).

Description

Returns the overall execution time of the specified AGAS service since its creation (in nanoseconds).
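Combining this counter with the matching invocation count counter gives the average cost of one service call. The following is a small sketch of that arithmetic. The function name average_time_ns is hypothetical, and the two inputs are assumed to have been read from the time and count counters of the same service on the same instance.

```cpp
#include <cassert>
#include <cstdint>

// Average execution time per invocation in nanoseconds. Returns 0 if the
// service has not been invoked yet, to avoid dividing by zero.
inline double average_time_ns(
    std::uint64_t total_time_ns, std::uint64_t invocations)
{
    if (invocations == 0)
        return 0.0;
    return static_cast<double>(total_time_ns) /
        static_cast<double>(invocations);
}

// Usage: a service that reported 1500 ns in total over 3 invocations.
inline void average_time_usage()
{
    assert(average_time_ns(1500, 3) == 500.0);
}
```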

Table 32 AGAS performance counter /agas/<agas_service_category>/time#

Counter type

/agas/<agas_service_category>/time

where <agas_service_category> is one of the following: primary, locality, component or symbol

Counter instance formatting

<agas_instance>/total

where <agas_instance> is the name of the AGAS service to query. Currently, this value will be locality#0 where 0 is the root locality (the id of the locality hosting the AGAS service). The exception is an <agas_service_category> of primary or symbol, for which the value for * can be any locality id (only the primary and symbol AGAS service components live on all localities, whereas all other AGAS services are available on locality#0 only).

Description

Returns the overall execution time of all AGAS services provided by the given AGAS service category since its creation (in nanoseconds).

Table 33 AGAS performance counter /agas/count/entries#

Counter type

/agas/count/entries

Counter instance formatting

locality#*/total

where * is the locality id of the locality whose AGAS cache should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the number of cache entries resident in the AGAS cache of the specified locality (see <cache_statistics>).

Table 34 AGAS performance counter /agas/count/<cache_statistics>#

Counter type

/agas/count/<cache_statistics>

where <cache_statistics> is one of the following: cache/evictions, cache/hits, cache/insertions, cache/misses

Counter instance formatting

locality#*/total

where * is the locality id of the locality whose AGAS cache should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the number of cache events (evictions, hits, inserts, and misses) in the AGAS cache of the specified locality (see <cache_statistics>).
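A common use of these counters is to derive the hit rate of the AGAS cache from the cache/hits and cache/misses values. The sketch below shows the arithmetic only. The function name cache_hit_rate is hypothetical, and both inputs are assumed to come from the same locality.

```cpp
#include <cassert>
#include <cstdint>

// Fraction of cache lookups that were hits, in the range [0, 1].
// Returns 0 if no lookups have been recorded yet.
inline double cache_hit_rate(std::uint64_t hits, std::uint64_t misses)
{
    std::uint64_t const lookups = hits + misses;
    if (lookups == 0)
        return 0.0;
    return static_cast<double>(hits) / static_cast<double>(lookups);
}

// Usage: 75 hits and 25 misses give a hit rate of 75%.
inline void cache_hit_rate_usage()
{
    assert(cache_hit_rate(75, 25) == 0.75);
}
```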

Table 35 AGAS performance counter /agas/count/<full_cache_statistics>#

Counter type

/agas/count/<full_cache_statistics>

where <full_cache_statistics> is one of the following: cache/get_entry, cache/insert_entry, cache/update_entry, cache/erase_entry

Counter instance formatting

locality#*/total

where * is the locality id of the locality whose AGAS cache should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the number of invocations of the specified cache API function of the AGAS cache.

Table 36 AGAS performance counter /agas/time/<full_cache_statistics>#

Counter type

/agas/time/<full_cache_statistics>

where <full_cache_statistics> is one of the following:

cache/get_entry, cache/insert_entry, cache/update_entry, cache/erase_entry

Counter instance formatting

locality#*/total

where * is the locality id of the locality whose AGAS cache should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall time spent executing the specified API function of the AGAS cache.

Table 37 Parcel layer performance counter /data/count/<connection_type>/<operation>#

Counter type

/data/count/<connection_type>/<operation>

where: <operation> is one of the following: sent, received

<connection_type> is one of the following: tcp, mpi

Counter instance formatting

locality#*/total

where * is the locality id of the locality the overall number of transmitted bytes should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall number of raw (uncompressed) bytes sent or received (see <operation>, e.g. sent or received) for the specified <connection_type>.

The performance counters are available only if the compile time constant HPX_HAVE_PARCELPORT_COUNTERS was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_COUNTERS.

The performance counters for the connection type mpi are available only if the compile time constant HPX_HAVE_PARCELPORT_MPI was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_MPI.

Please see CMake options for more details.

Table 38 Parcel layer performance counter /data/time/<connection_type>/<operation>#

Counter type

/data/time/<connection_type>/<operation>

where:

<operation> is one of the following: sent, received

<connection_type> is one of the following: tcp, mpi

Counter instance formatting

locality#*/total

where * is the locality id of the locality the total transmission time should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the total time (in nanoseconds) between the start of each asynchronous transmission operation and the end of the corresponding operation for the specified <connection_type> on the given locality (see <operation>, e.g. sent or received).

The performance counters are available only if the compile time constant HPX_HAVE_PARCELPORT_COUNTERS was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_COUNTERS.

The performance counters for the connection type mpi are available only if the compile time constant HPX_HAVE_PARCELPORT_MPI was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_MPI.

Please see CMake options for more details.

Table 39 Parcel layer performance counter /serialize/count/<connection_type>/<operation>#

Counter type

/serialize/count/<connection_type>/<operation>

where:

<operation> is one of the following: sent, received

<connection_type> is one of the following: tcp, mpi

Counter instance formatting

locality#*/total

where * is the locality id of the locality the overall number of transmitted bytes should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall number of bytes transferred (see <operation>, e.g. sent or received, possibly compressed) for the specified <connection_type> by the given locality.

The performance counters are available only if the compile time constant HPX_HAVE_PARCELPORT_COUNTERS was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_COUNTERS.

The performance counters for the connection type mpi are available only if the compile time constant HPX_HAVE_PARCELPORT_MPI was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_MPI.

Please see CMake options for more details.

Parameters

If the configure-time option -DHPX_WITH_PARCELPORT_ACTION_COUNTERS=On was specified, this counter allows one to specify an optional action name as its parameter. In this case the counter will report the number of bytes transmitted for the given action only.

Table 40 Parcel layer performance counter /serialize/time/<connection_type>/<operation>#

Counter type

/serialize/time/<connection_type>/<operation>

where:

<operation> is one of the following: sent, received

<connection_type> is one of the following: tcp, mpi

Counter instance formatting

locality#*/total

where * is the locality id of the locality the serialization time should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall time spent performing outgoing data serialization for the specified <connection_type> on the given locality (see <operation>, e.g. sent or received).

The performance counters are available only if the compile time constant HPX_HAVE_PARCELPORT_COUNTERS was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_COUNTERS.

The performance counters for the connection type mpi are available only if the compile time constant HPX_HAVE_PARCELPORT_MPI was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_MPI.

Please see CMake options for more details.

Parameters

If the configure-time option -DHPX_WITH_PARCELPORT_ACTION_COUNTERS=On was specified, this counter allows one to specify an optional action name as its parameter. In this case the counter will report the serialization time for the given action only.

Table 41 Parcel layer performance counter /parcels/count/routed#

Counter type

/parcels/count/routed

Counter instance formatting

locality#*/total

where * is the locality id of the locality the number of routed parcels should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall number of routed (outbound) parcels transferred by the given locality.

Routed parcels are those which cannot be delivered directly to their destination because the local AGAS is not able to resolve the destination address. In this case a parcel is sent to the AGAS service component which is responsible for creating the destination GID (and is responsible for resolving the destination address). This AGAS service component will deliver the parcel to its final target.

Parameters

If the configure-time option -DHPX_WITH_PARCELPORT_ACTION_COUNTERS=On was specified, this counter allows one to specify an optional action name as its parameter. In this case the counter will report the number of parcels for the given action only.

Table 42 Parcel layer performance counter /parcels/count/<connection_type>/<operation>#

Counter type

/parcels/count/<connection_type>/<operation>

where:

<operation> is one of the following: sent, received

<connection_type> is one of the following: tcp, mpi

Counter instance formatting

locality#*/total

where * is the locality id of the locality the number of parcels should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall number of parcels transferred using the specified <connection_type> by the given locality (see <operation>, e.g. sent or received).

The performance counters are available only if the compile time constant HPX_HAVE_PARCELPORT_COUNTERS was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_COUNTERS.

The performance counters for the connection type mpi are available only if the compile time constant HPX_HAVE_PARCELPORT_MPI was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_MPI.

Please see CMake options for more details.

Table 43 Parcel layer performance counter /messages/count/<connection_type>/<operation>#

Counter type

/messages/count/<connection_type>/<operation>

where:

<operation> is one of the following: sent, received

<connection_type> is one of the following: tcp, mpi

Counter instance formatting

locality#*/total

where * is the locality id of the locality the number of messages should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall number of messages 1 transferred using the specified <connection_type> by the given locality (see <operation>, e.g. sent or received).

The performance counters are available only if the compile time constant HPX_HAVE_PARCELPORT_COUNTERS was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_COUNTERS.

The performance counters for the connection type mpi are available only if the compile time constant HPX_HAVE_PARCELPORT_MPI was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_MPI.

Please see CMake options for more details.

Table 44 Parcel layer performance counter /parcelport/count/<connection_type>/zero_copy_chunks/<operation>#

Counter type

/parcelport/count/<connection_type>/zero_copy_chunks/<operation>

where:

<operation> is one of the following: sent, received

<connection_type> is one of the following: tcp, mpi

Counter instance formatting

locality#*/total

where * is the locality id of the locality the overall number of transmitted bytes should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall number of zero-copy chunks sent or received (see <operation>, e.g. sent or received) for the specified <connection_type>.

The performance counters are available only if the compile time constant HPX_HAVE_PARCELPORT_COUNTERS was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_COUNTERS.

The performance counters for the connection type mpi are available only if the compile time constant HPX_HAVE_PARCELPORT_MPI was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_MPI.

Please see CMake options for more details.

Table 45 Parcel layer performance counter /parcelport/count-max/<connection_type>/zero_copy_chunks/<operation>#

Counter type

/parcelport/count-max/<connection_type>/zero_copy_chunks/<operation>

where:

<operation> is one of the following: sent, received

<connection_type> is one of the following: tcp, mpi

Counter instance formatting

locality#*/total

where * is the locality id of the locality the overall number of transmitted bytes should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the maximum number of zero-copy chunks sent or received per message (see <operation>, e.g. sent or received) for the specified <connection_type>.

The performance counters are available only if the compile time constant HPX_HAVE_PARCELPORT_COUNTERS was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_COUNTERS.

The performance counters for the connection type mpi are available only if the compile time constant HPX_HAVE_PARCELPORT_MPI was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_MPI.

Please see CMake options for more details.

Table 46 Parcel layer performance counter /parcelport/size/<connection_type>/zero_copy_chunks/<operation>#

Counter type

/parcelport/size/<connection_type>/zero_copy_chunks/<operation>

where:

<operation> is one of the following: sent, received

<connection_type> is one of the following: tcp, mpi

Counter instance formatting

locality#*/total

where * is the locality id of the locality the overall number of transmitted bytes should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall size of zero-copy chunks sent or received (see <operation>, e.g. sent or received) for the specified <connection_type>.

The performance counters are available only if the compile time constant HPX_HAVE_PARCELPORT_COUNTERS was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_COUNTERS.

The performance counters for the connection type mpi are available only if the compile time constant HPX_HAVE_PARCELPORT_MPI was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_MPI.

Please see CMake options for more details.

Table 47 Parcel layer performance counter /parcelport/size-max/<connection_type>/zero_copy_chunks/<operation>#

Counter type

/parcelport/size-max/<connection_type>/zero_copy_chunks/<operation>

where:

<operation> is one of the following: sent, received

<connection_type> is one of the following: tcp, mpi

Counter instance formatting

locality#*/total

where * is the locality id of the locality the overall number of transmitted bytes should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the maximum size of zero-copy chunks sent or received (see <operation>, e.g. sent or received) for the specified <connection_type>.

The performance counters are available only if the compile time constant HPX_HAVE_PARCELPORT_COUNTERS was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_COUNTERS.

The performance counters for the connection type mpi are available only if the compile time constant HPX_HAVE_PARCELPORT_MPI was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_MPI.

Please see CMake options for more details.

Table 48 Parcel layer performance counter /parcelport/count/<connection_type>/<cache_statistics>#

Counter type

/parcelport/count/<connection_type>/<cache_statistics>

where:

<cache_statistics> is one of the following: cache/insertions, cache/evictions, cache/hits, cache/misses

<connection_type> is one of the following: tcp, mpi

Counter instance formatting

locality#*/total

where * is the locality id of the locality the number of messages should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall number of cache events (evictions, hits, inserts, misses, and reclaims) for the connection cache of the given connection type on the given locality (see <cache_statistics>, e.g. cache/insertions, cache/evictions, cache/hits, cache/misses or cache/reclaims).

The performance counters for the connection type mpi are available only if the compile time constant HPX_HAVE_PARCELPORT_MPI was defined while compiling the HPX core library (which is not defined by default). The corresponding cmake configuration constant is HPX_WITH_PARCELPORT_MPI.

Please see CMake options for more details.

Table 49 Parcel layer performance counter /parcelqueue/length/<operation>#

Counter type

/parcelqueue/length/<operation>

where <operation> is one of the following: sent, receive

Counter instance formatting

locality#*/total

where * is the locality id of the locality whose parcel queue should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the current number of parcels stored in the parcel queue (see <operation> for which queue to query, e.g. sent or received).

Table 50 Thread manager performance counter /threads/count/cumulative#

Counter type

/threads/count/cumulative

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the overall number of retired HPX-threads should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the current value of the idle-loop counter should be queried for.

worker-thread#* is defining the worker thread for which the overall number of retired HPX-threads should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the overall number of executed (retired) HPX-threads on the given locality since application start. If the instance name is total the counter returns the accumulated number of retired HPX-threads for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the overall number of retired HPX-threads for all worker threads separately. This counter is available only if the configuration time constant HPX_WITH_THREAD_CUMULATIVE_COUNTS is set to ON (default: ON).
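The three instance name forms listed above differ only in which parts are present. The following sketch builds them from their components. The helpers total_instance and worker_instance are hypothetical, and the sketch assumes that the placeholder after pool# is a pool name, as the reference to the ‘default’ pool suggests.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: instance name for the accumulated value of a locality.
inline std::string total_instance(unsigned locality)
{
    return "locality#" + std::to_string(locality) + "/total";
}

// Hypothetical helper: instance name for one worker thread. An empty pool
// name omits the pool element, which selects the default pool.
inline std::string worker_instance(
    unsigned locality, unsigned worker, std::string const& pool = "")
{
    std::string name = "locality#" + std::to_string(locality);
    if (!pool.empty())
        name += "/pool#" + pool;
    return name + "/worker-thread#" + std::to_string(worker);
}

// Usage: one example per instance name form.
inline void instance_name_usage()
{
    assert(total_instance(0) == "locality#0/total");
    assert(worker_instance(0, 3) == "locality#0/worker-thread#3");
    assert(worker_instance(1, 2, "default") ==
        "locality#1/pool#default/worker-thread#2");
}
```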

Table 51 Thread manager performance counter /threads/time/average#

Counter type

/threads/time/average

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the average time spent executing one HPX-thread should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the current value of the idle-loop counter should be queried for.

worker-thread#* is defining the worker thread for which the average time spent executing one HPX-thread should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the average time spent executing one HPX-thread on the given locality since application start. If the instance name is total the counter returns the average time spent executing one HPX-thread for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the average time spent executing one HPX-thread for all worker threads separately. This counter is available only if the configuration time constants HPX_WITH_THREAD_CUMULATIVE_COUNTS (default: ON) and HPX_WITH_THREAD_IDLE_RATES are set to ON (default: OFF). The unit of measure for this counter is nanosecond [ns].

Table 52 Thread manager performance counter /threads/time/average-overhead#

Counter type

/threads/time/average-overhead

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the average overhead spent executing one HPX-thread should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the average overhead spent executing one HPX-thread should be queried for.

worker-thread#* is defining the worker thread for which the average overhead spent executing one HPX-thread should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the average time spent on overhead while executing one HPX-thread on the given locality since application start. If the instance name is total the counter returns the average time spent on overhead while executing one HPX-thread for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the average time spent on overhead executing one HPX-thread for all worker threads separately. This counter is available only if the configuration time constants HPX_WITH_THREAD_CUMULATIVE_COUNTS (default: ON) and HPX_WITH_THREAD_IDLE_RATES are set to ON (default: OFF). The unit of measure for this counter is nanosecond [ns].

Table 53 Thread manager performance counter /threads/count/cumulative-phases#

Counter type

/threads/count/cumulative-phases

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the overall number of executed HPX-thread phases (invocations) should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the overall number of executed HPX-thread phases (invocations) should be queried for.

worker-thread#* is defining the worker thread for which the overall number of executed HPX-thread phases (invocations) should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the overall number of executed HPX-thread phases (invocations) on the given locality since application start. If the instance name is total the counter returns the accumulated number of executed HPX-thread phases (invocations) for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the overall number of executed HPX-thread phases for all worker threads separately. This counter is available only if the configuration time constant HPX_WITH_THREAD_CUMULATIVE_COUNTS is set to ON (default: ON).

Table 54 Thread manager performance counter /threads/time/average-phase#

Counter type

/threads/time/average-phase

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the average time spent executing one HPX-thread phase (invocation) should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the average time spent executing one HPX-thread phase (invocation) should be queried for.

worker-thread#* is defining the worker thread for which the average time executing one HPX-thread phase (invocation) should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the average time spent executing one HPX-thread phase (invocation) on the given locality since application start. If the instance name is total the counter returns the average time spent executing one HPX-thread phase (invocation) for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the average time spent executing one HPX-thread phase for all worker threads separately. This counter is available only if the configuration time constants HPX_WITH_THREAD_CUMULATIVE_COUNTS (default: ON) and HPX_WITH_THREAD_IDLE_RATES are set to ON (default: OFF). The unit of measure for this counter is nanosecond [ns].

Table 55 Thread manager performance counter /threads/time/average-phase-overhead#

Counter type

/threads/time/average-phase-overhead

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the average time overhead executing one HPX-thread phase (invocation) should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the average overhead executing one HPX-thread phase (invocation) should be queried for.

worker-thread#* is defining the worker thread for which the average overhead executing one HPX-thread phase (invocation) should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the average time spent on overhead executing one HPX-thread phase (invocation) on the given locality since application start. If the instance name is total the counter returns the average time spent on overhead while executing one HPX-thread phase (invocation) for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the average time spent on overhead executing one HPX-thread phase for all worker threads separately. This counter is available only if the configuration time constants HPX_WITH_THREAD_CUMULATIVE_COUNTS (default: ON) and HPX_WITH_THREAD_IDLE_RATES are set to ON (default: OFF). The unit of measure for this counter is nanosecond [ns].

Table 56 Thread manager performance counter /threads/time/overall#

Counter type

/threads/time/overall

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the overall time spent running the scheduler should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the overall time spent running the scheduler should be queried for.

worker-thread#* is defining the worker thread for which the overall time spent running the scheduler should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the overall time spent running the scheduler on the given locality since application start. If the instance name is total the counter returns the overall time spent running the scheduler for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the overall time spent running the scheduler for all worker threads separately. This counter is available only if the configuration time constant HPX_WITH_THREAD_IDLE_RATES is set to ON (default: OFF). The unit of measure for this counter is nanosecond [ns].

Table 57 Thread manager performance counter /threads/time/cumulative#

Counter type

/threads/time/cumulative

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the overall time spent executing all HPX-threads should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the overall time spent executing all HPX-threads should be queried for.

worker-thread#* is defining the worker thread for which the overall time spent executing all HPX-threads should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the overall time spent executing all HPX-threads on the given locality since application start. If the instance name is total the counter returns the overall time spent executing all HPX-threads for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the overall time spent executing all HPX-threads for all worker threads separately. This counter is available only if the configuration time constants HPX_WITH_THREAD_CUMULATIVE_COUNTS (default: ON) and HPX_WITH_THREAD_IDLE_RATES are set to ON (default: OFF).
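The cumulative and average counters can be cross-checked against each other. The sketch below assumes that the average execution time is simply the cumulative execution time divided by the cumulative number of retired HPX-threads (as reported by a counter such as /threads/count/cumulative). That relation is an assumption made for illustration, and the function average_ns is hypothetical, not an HPX API.

```cpp
#include <cstdint>

// Hypothetical cross-check (not part of HPX). Assumption: the average time
// per HPX-thread equals cumulative time divided by the cumulative count.
std::uint64_t average_ns(std::uint64_t cumulative_ns, std::uint64_t count)
{
    // No HPX-thread has retired yet, so there is nothing to average.
    if (count == 0)
        return 0;

    // Integer division: any remainder below one nanosecond is dropped.
    return cumulative_ns / count;
}
```

A large gap between this value and the average reported by the runtime would suggest that the two readings were taken at different times or for different instances.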

Table 58 Thread manager performance counter /threads/time/cumulative-overheads#

Counter type

/threads/time/cumulative-overheads

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the overall overhead time incurred by executing all HPX-threads should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the overall overhead time incurred by executing all HPX-threads should be queried for.

worker-thread#* is defining the worker thread for which the overall overhead time incurred by executing all HPX-threads should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the overall overhead time incurred executing all HPX-threads on the given locality since application start. If the instance name is total the counter returns the overall overhead time incurred executing all HPX-threads for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the overall overhead time incurred executing all HPX-threads for all worker threads separately. This counter is available only if the configuration time constants HPX_WITH_THREAD_CUMULATIVE_COUNTS (default: ON) and HPX_WITH_THREAD_IDLE_RATES are set to ON (default: OFF). The unit of measure for this counter is nanosecond [ns].

Table 59 Thread manager performance counter /threads/count/instantaneous/<thread-state>#

Counter type

/threads/count/instantaneous/<thread-state>

where:

<thread-state> is one of the following: all, active, pending, suspended, terminated, staged

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the current number of threads with the given state should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the current number of threads with the given state should be queried for.

worker-thread#* is defining the worker thread for which the current number of threads with the given state should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

The staged thread state refers to registered tasks before they are converted to thread objects.

Description

Returns the current number of HPX-threads having the given thread state on the given locality. If the instance name is total the counter returns the current number of HPX-threads of the given state for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the current number of HPX-threads in the given state for all worker threads separately.
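Because <thread-state> is a placeholder, each of the six listed states gives its own counter. The following sketch expands the placeholder for one instance name so that all six counters can be requested together. The function instantaneous_counter_names is hypothetical and not part of HPX; it only builds strings.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of HPX): list the full names of the
// instantaneous thread-count counters for one instance name.
std::vector<std::string> instantaneous_counter_names(
    std::string const& instance)
{
    // The thread states listed for <thread-state> in the table above.
    static char const* const states[] = {
        "all", "active", "pending", "suspended", "terminated", "staged"};

    std::vector<std::string> names;
    for (char const* state : states)
    {
        names.push_back(
            "/threads{" + instance + "}/count/instantaneous/" + state);
    }
    return names;
}
```

Passing "locality#0/total" yields six names, from /threads{locality#0/total}/count/instantaneous/all through /threads{locality#0/total}/count/instantaneous/staged.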

Table 60 Thread manager performance counter /threads/wait-time/<thread-state>#

Counter type

/threads/wait-time/<thread-state>

where:

<thread-state> is one of the following: pending, staged

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the average wait time of HPX-threads (pending) or thread descriptions (staged) with the given state should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the average wait time for the given state should be queried for.

worker-thread#* is defining the worker thread for which the average wait time for the given state should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

The staged thread state refers to the wait time of registered tasks before they are converted into thread objects, while the pending thread state refers to the wait time of threads in any of the scheduling queues.

Description

Returns the average wait time of HPX-threads (if the thread state is pending) or of task descriptions (if the thread state is staged) on the given locality since application start. If the instance name is total the counter returns the wait time of HPX-threads of the given state for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the wait time of HPX-threads in the given state for all worker threads separately.

These counters are available only if the compile time constant HPX_WITH_THREAD_QUEUE_WAITTIME was defined while compiling the HPX core library (default: OFF). The unit of measure for this counter is nanosecond [ns].

Table 61 Thread manager performance counter /threads/idle-rate#

Counter type

/threads/idle-rate

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the average idle rate of all (or one) worker threads should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the averaged idle rate should be queried for.

worker-thread#* is defining the worker thread for which the averaged idle rate should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the average idle rate for the given worker thread(s) on the given locality. The idle rate is defined as the ratio of the time spent on scheduling and management tasks and the overall time spent executing work since the application started. This counter is available only if the configuration time constant HPX_WITH_THREAD_IDLE_RATES is set to ON (default: OFF).
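The definition above can be written out as a small calculation. This is only a sketch of the stated ratio, assuming both times are available in nanoseconds; the function idle_rate is hypothetical, and the scaling that HPX applies to the reported counter value is not specified here.

```cpp
#include <cstdint>

// Hypothetical illustration (not part of HPX): idle rate as the ratio of
// time spent on scheduling and management to the overall time.
double idle_rate(std::uint64_t overhead_ns, std::uint64_t overall_ns)
{
    // Avoid dividing by zero before any time has been accounted for.
    if (overall_ns == 0)
        return 0.0;

    return static_cast<double>(overhead_ns) /
        static_cast<double>(overall_ns);
}
```

A result of 0.25 means that a quarter of the measured time went to scheduling and management rather than to executing HPX-threads.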

Table 62 Thread manager performance counter /threads/creation-idle-rate#

Counter type

/threads/creation-idle-rate

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the average creation idle rate of all (or one) worker threads should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the averaged creation idle rate should be queried for.

worker-thread#* is defining the worker thread for which the averaged creation idle rate should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the average idle rate for the given worker thread(s) on the given locality which is caused by creating new threads. The creation idle rate is defined as the ratio of the time spent on creating new threads and the overall time spent executing work since the application started. This counter is available only if the configuration time constants HPX_WITH_THREAD_IDLE_RATES (default: OFF) and HPX_WITH_THREAD_CREATION_AND_CLEANUP_RATES are set to ON.

Table 63 Thread manager performance counter /threads/cleanup-idle-rate#

Counter type

/threads/cleanup-idle-rate

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the average cleanup idle rate of all (or one) worker threads should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the averaged cleanup idle rate should be queried for.

worker-thread#* is defining the worker thread for which the averaged cleanup idle rate should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the average idle rate for the given worker thread(s) on the given locality which is caused by cleaning up terminated threads. The cleanup idle rate is defined as the ratio of the time spent on cleaning up terminated thread objects and the overall time spent executing work since the application started. This counter is available only if the configuration time constants HPX_WITH_THREAD_IDLE_RATES (default: OFF) and HPX_WITH_THREAD_CREATION_AND_CLEANUP_RATES are set to ON.

Table 64 Thread manager performance counter /threadqueue/length#

Counter type

/threadqueue/length

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the current length of all thread queues in the scheduler for all (or one) worker threads should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the current length of all thread queues in the scheduler should be queried for.

worker-thread#* is defining the worker thread for which the current length of all thread queues in the scheduler should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the overall length of all queues for the given worker thread(s) on the given locality.

Table 65 Thread manager performance counter /threads/count/stack-unbinds#

Counter type

/threads/count/stack-unbinds

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality the unbind (madvise) operations should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the total number of HPX-thread unbind (madvise) operations performed for the referenced locality. Note that this counter is not available on Windows based platforms.

Table 66 Thread manager performance counter /threads/count/stack-recycles#

Counter type

/threads/count/stack-recycles

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality the recycling operations should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the total number of HPX-thread recycling operations performed.

Table 67 Thread manager performance counter /threads/count/stolen-from-pending#

Counter type

/threads/count/stolen-from-pending

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality the number of ‘stolen’ threads should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the total number of HPX-threads ‘stolen’ from the pending thread queue by a neighboring worker thread (these threads are executed by a different worker thread than they were initially scheduled on). This counter is available only if the configuration time constant HPX_WITH_THREAD_STEALING_COUNTS is set to ON (default: ON).

Table 68 Thread manager performance counter /threads/count/pending-misses#

Counter type

/threads/count/pending-misses

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the number of pending queue misses of all (or one) worker threads should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the number of pending queue misses should be queried for.

worker-thread#* is defining the worker thread for which the number of pending queue misses should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the total number of times that the referenced worker-thread on the referenced locality failed to find pending HPX-threads in its associated queue. This counter is available only if the configuration time constant HPX_WITH_THREAD_STEALING_COUNTS is set to ON (default: ON).

Table 69 Thread manager performance counter /threads/count/pending-accesses#

Counter type

/threads/count/pending-accesses

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the number of pending queue accesses of all (or one) worker threads should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the number of pending queue accesses should be queried for.

worker-thread#* is defining the worker thread for which the number of pending queue accesses should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the total number of times that the referenced worker-thread on the referenced locality looked for pending HPX-threads in its associated queue. This counter is available only if the configuration time constant HPX_WITH_THREAD_STEALING_COUNTS is set to ON (default: ON).
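The pending-misses and pending-accesses counters are most informative when read together. One possible derived metric, sketched below, is the fraction of queue lookups that found no pending HPX-thread. This ratio is an illustration only and is not itself an HPX counter; the function pending_miss_ratio is hypothetical.

```cpp
#include <cstdint>

// Hypothetical derived metric (not an HPX counter): the share of lookups in
// the pending queue that found no work, from the two counter values.
double pending_miss_ratio(std::uint64_t misses, std::uint64_t accesses)
{
    // Without any access there is no meaningful ratio to report.
    if (accesses == 0)
        return 0.0;

    return static_cast<double>(misses) / static_cast<double>(accesses);
}
```

A ratio close to 1 would indicate that a worker thread mostly finds its queue empty.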

Table 70 Thread manager performance counter /threads/count/stolen-from-staged#

Counter type

/threads/count/stolen-from-staged

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the number of HPX-threads stolen from the staged queue of all (or one) worker threads should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the number of HPX-threads stolen from the staged queue should be queried for.

worker-thread#* is defining the worker thread for which the number of HPX-threads stolen from the staged queue should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the total number of HPX-threads ‘stolen’ from the staged thread queue by a neighboring worker thread (these threads are executed by a different worker thread than they were initially scheduled on). This counter is available only if the configuration time constant HPX_WITH_THREAD_STEALING_COUNTS is set to ON (default: ON).

Table 71 Thread manager performance counter /threads/count/stolen-to-pending#

Counter type

/threads/count/stolen-to-pending

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the number of HPX-threads stolen to the pending queue of all (or one) worker threads should be queried for. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the number of HPX-threads stolen to the pending queue should be queried for.

worker-thread#* is defining the worker thread for which the number of HPX-threads stolen to the pending queue should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the total number of HPX-threads ‘stolen’ to the pending thread queue of the worker thread (these threads are executed by a different worker thread than they were initially scheduled on). This counter is available only if the configuration time constant HPX_WITH_THREAD_STEALING_COUNTS is set to ON (default: ON).

Table 72 Thread manager performance counter /threads/count/stolen-to-staged#

Counter type

/threads/count/stolen-to-staged

Counter instance formatting

locality#*/total or

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the number of HPX-threads stolen to the staged queue of all (or one) worker threads should be queried for. The locality id (given by *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the number of HPX-threads stolen to the staged queue should be queried for.

worker-thread#* is defining the worker thread for which the number of HPX-threads stolen to the staged queue should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the total number of HPX-threads ‘stolen’ to the staged thread queue of a neighboring worker thread (these threads are executed by a different worker thread than they were initially scheduled on). This counter is available only if the configuration time constant HPX_WITH_THREAD_STEALING_COUNTS is set to ON (default: ON).

Table 73 Thread manager performance counter /threads/count/objects#

Counter type

/threads/count/objects

Counter instance formatting

locality#*/total or

locality#*/allocator#*

where:

locality#* is defining the locality for which the current (cumulative) number of all created HPX-thread objects should be queried for. The locality id (given by *) is a (zero based) number identifying the locality.

allocator#* is defining the number of the allocator instance using which the threads have been created. HPX uses a varying number of allocators to create (and recycle) HPX-thread objects, most likely these counters are of use for debugging purposes only. The allocator id (given by *) is a (zero based) number identifying the allocator to query.

Description

Returns the total number of HPX-thread objects created. Note that thread objects are reused to improve system performance, thus this number does not reflect the number of actually executed (retired) HPX-threads.

Table 74 Thread manager performance counter /scheduler/utilization/instantaneous#

Counter type

/scheduler/utilization/instantaneous

Counter instance formatting

locality#*/total

where:

locality#* is defining the locality for which the current (instantaneous) scheduler utilization should be queried for. The locality id (given by *) is a (zero based) number identifying the locality.

Description

Returns the total (instantaneous) scheduler utilization. This is the current percentage of scheduler threads executing HPX threads.

Parameters

Percent
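As a worked illustration of the percentage described above, the sketch below computes the share of worker threads that are currently executing HPX threads. The function utilization_percent is hypothetical, and the truncating integer arithmetic is an assumption; how HPX rounds the reported value is not stated here.

```cpp
#include <cstdint>

// Hypothetical illustration (not part of HPX): percentage of worker threads
// that are executing an HPX thread at this moment.
std::uint64_t utilization_percent(
    std::uint64_t busy_workers, std::uint64_t all_workers)
{
    // With no worker threads there is nothing to utilize.
    if (all_workers == 0)
        return 0;

    // Multiply first so that integer division keeps whole percent values.
    return busy_workers * 100 / all_workers;
}
```

With three of four worker threads busy, the result is 75.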

Table 75 Thread manager performance counter /threads/idle-loop-count/instantaneous#

Counter type

/threads/idle-loop-count/instantaneous

Counter instance formatting

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the current accumulated value of all idle-loop counters of all worker threads should be queried. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the current value of the idle-loop counter should be queried for.

worker-thread#* is defining the worker thread for which the current value of the idle-loop counter should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the current (instantaneous) idle-loop count for the given HPX worker thread or the accumulated value for all worker threads.

Table 76 Thread manager performance counter /threads/busy-loop-count/instantaneous#

Counter type

/threads/busy-loop-count/instantaneous

Counter instance formatting

locality#*/worker-thread#* or

locality#*/pool#*/worker-thread#*

where:

locality#* is defining the locality for which the current accumulated value of all busy-loop counters of all worker threads should be queried. The locality id (given by the *) is a (zero based) number identifying the locality.

pool#* is defining the pool for which the current value of the busy-loop counter should be queried for.

worker-thread#* is defining the worker thread for which the current value of the busy-loop counter should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads. If no pool-name is specified the counter refers to the ‘default’ pool.

Description

Returns the current (instantaneous) busy-loop count for the given HPX worker thread or the accumulated value for all worker threads.


Table 77 Thread manager performance counter /threads/time/background-work-duration#

Counter type

/threads/time/background-work-duration

Counter instance formatting

locality#*/total or

locality#*/worker-thread#*

where:

locality#* is defining the locality for which the overall time spent performing background work should be queried for. The locality id (given by *) is a (zero based) number identifying the locality.

worker-thread#* is defining the worker thread for which the overall time spent performing background work should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads.

Description

Returns the overall time spent performing background work on the given locality since application start. If the instance name is total the counter returns the overall time spent performing background work for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the overall time spent performing background work for all worker threads separately. This counter is available only if the configuration time constants HPX_WITH_BACKGROUND_THREAD_COUNTERS (default: OFF) and HPX_WITH_THREAD_IDLE_RATES are set to ON (default: OFF). The unit of measure for this counter is nanosecond [ns].

Table 78 Thread manager performance counter /threads/background-overhead#

Counter type

/threads/background-overhead

Counter instance formatting

locality#*/total or

locality#*/worker-thread#*

where:

locality#* is defining the locality for which the background overhead should be queried for. The locality id (given by *) is a (zero based) number identifying the locality.

worker-thread#* is defining the worker thread for which the background overhead should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads.

Description

Returns the background overhead on the given locality since application start. If the instance name is total the counter returns the background overhead for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return background overhead for all worker threads separately. This counter is available only if the configuration time constants HPX_WITH_BACKGROUND_THREAD_COUNTERS (default: OFF) and HPX_WITH_THREAD_IDLE_RATES are set to ON (default: OFF). The unit of measure displayed for this counter is 0.1%.
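The overhead counters report their value in units of 0.1%, so a displayed value has to be divided by ten to obtain a percentage. The following is a minimal sketch of that conversion. The helper overhead_to_percent is hypothetical (it is not part of HPX), and the sketch assumes the raw value is an integer number of 0.1% steps.

```cpp
#include <cstdint>

// Hypothetical helper, not part of HPX.
// Converts a raw overhead value given in units of 0.1% into a percentage.
// Assumption: the raw value counts 0.1% steps, e.g. 125 stands for 12.5%.
inline double overhead_to_percent(std::int64_t raw_tenths_of_percent)
{
    return static_cast<double>(raw_tenths_of_percent) / 10.0;
}
```

Under this assumption, overhead_to_percent(125) yields 12.5.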

Table 79 Thread manager performance counter /threads/time/background-send-duration#

Counter type

/threads/time/background-send-duration

Counter instance formatting

locality#*/total or

locality#*/worker-thread#*

where:

locality#* is defining the locality for which the overall time spent performing background work related to sending parcels should be queried for. The locality id (given by *) is a (zero based) number identifying the locality.

worker-thread#* is defining the worker thread for which the overall time spent performing background work related to sending parcels should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads.

Description

Returns the overall time spent performing background work related to sending parcels on the given locality since application start. If the instance name is total the counter returns the overall time spent performing background work for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the overall time spent performing background work for all worker threads separately. This counter is available only if the configuration time constants HPX_WITH_BACKGROUND_THREAD_COUNTERS (default: OFF) and HPX_WITH_THREAD_IDLE_RATES are set to ON (default: OFF). The unit of measure for this counter is nanosecond [ns].

This counter will currently return meaningful values for the MPI parcelport only.

Table 80 Thread manager performance counter /threads/background-send-overhead#

Counter type

/threads/background-send-overhead

Counter instance formatting

locality#*/total or

locality#*/worker-thread#*

where:

locality#* is defining the locality for which the background overhead related to sending parcels should be queried for. The locality id (given by *) is a (zero based) number identifying the locality.

worker-thread#* is defining the worker thread for which the background overhead related to sending parcels should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads.

Description

Returns the background overhead related to sending parcels on the given locality since application start. If the instance name is total the counter returns the background overhead for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return background overhead for all worker threads separately. This counter is available only if the configuration time constants HPX_WITH_BACKGROUND_THREAD_COUNTERS (default: OFF) and HPX_WITH_THREAD_IDLE_RATES are set to ON (default: OFF). The unit of measure displayed for this counter is 0.1%.

This counter will currently return meaningful values for the MPI parcelport only.

Table 81 Thread manager performance counter /threads/time/background-receive-duration#

Counter type

/threads/time/background-receive-duration

Counter instance formatting

locality#*/total or

locality#*/worker-thread#*

where:

locality#* is defining the locality for which the overall time spent performing background work related to receiving parcels should be queried for. The locality id (given by *) is a (zero based) number identifying the locality.

worker-thread#* is defining the worker thread for which the overall time spent performing background work related to receiving parcels should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads.

Description

Returns the overall time spent performing background work related to receiving parcels on the given locality since application start. If the instance name is total the counter returns the overall time spent performing background work for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return the overall time spent performing background work for all worker threads separately. This counter is available only if the configuration time constants HPX_WITH_BACKGROUND_THREAD_COUNTERS (default: OFF) and HPX_WITH_THREAD_IDLE_RATES are set to ON (default: OFF). The unit of measure for this counter is nanosecond [ns].

This counter will currently return meaningful values for the MPI parcelport only.

Table 82 Thread manager performance counter /threads/background-receive-overhead#

Counter type

/threads/background-receive-overhead

Counter instance formatting

locality#*/total or

locality#*/worker-thread#*

where:

locality#* is defining the locality for which the background overhead related to receiving parcels should be queried. The locality id (given by *) is a (zero based) number identifying the locality.

worker-thread#* is defining the worker thread for which the background overhead related to receiving parcels should be queried for. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads.

Description

Returns the background overhead related to receiving parcels on the given locality since application start. If the instance name is total the counter returns the background overhead for all worker threads (cores) on that locality. If the instance name is worker-thread#* the counter will return background overhead for all worker threads separately. This counter is available only if the configuration time constants HPX_WITH_BACKGROUND_THREAD_COUNTERS (default: OFF) and HPX_WITH_THREAD_IDLE_RATES are set to ON (default: OFF). The unit of measure displayed for this counter is 0.1%.

This counter will currently return meaningful values for the MPI parcelport only.

Table 83 General performance counter /runtime/count/component#

Counter type

/runtime/count/component

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the number of components should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall number of currently active components of the specified type on the given locality.

Parameters

The type of the component. This is the string which has been used while registering the component with HPX, e.g. which has been passed as the second parameter to the macro HPX_REGISTER_COMPONENT.
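A counter that takes a parameter is addressed by appending the parameter after a '@'. The following sketch composes such a name, assuming the /object{instance}/counter@parameters pattern used by the command line examples in this documentation. The helper make_counter_name and the component type my_component are hypothetical and serve only as an illustration.

```cpp
#include <string>

// Hypothetical helper, not part of HPX.
// Composes a full counter name of the form
//   /object{instance}/counter@parameters
// The '@parameters' part is omitted if no parameters are given.
inline std::string make_counter_name(std::string const& object,
    std::string const& instance, std::string const& counter,
    std::string const& parameters = "")
{
    std::string name = "/" + object + "{" + instance + "}/" + counter;
    if (!parameters.empty())
        name += "@" + parameters;
    return name;
}
```

For instance, make_counter_name("runtime", "locality#0/total", "count/component", "my_component") produces a name that could be passed to --hpx:print-counter, provided a component was registered under the (assumed) name my_component.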

Table 84 General performance counter /runtime/count/action-invocation#

Counter type

/runtime/count/action-invocation

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the number of action invocations should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall (local) invocation count of the specified action type on the given locality.

Parameters

The action type. This is the string which has been used while registering the action with HPX, e.g. which has been passed as the second parameter to the macro HPX_REGISTER_ACTION or HPX_REGISTER_ACTION_ID.

Table 85 General performance counter /runtime/count/remote-action-invocation#

Counter type

/runtime/count/remote-action-invocation

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the number of action invocations should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall (remote) invocation count of the specified action type on the given locality.

Parameters

The action type. This is the string which has been used while registering the action with HPX, e.g. which has been passed as the second parameter to the macro HPX_REGISTER_ACTION or HPX_REGISTER_ACTION_ID.

Table 86 General performance counter /runtime/uptime#

Counter type

/runtime/uptime

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the system uptime should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the overall time since application start on the given locality in nanoseconds.

Table 87 General performance counter /runtime/memory/virtual#

Counter type

/runtime/memory/virtual

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the allocated virtual memory should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the amount of virtual memory currently allocated by the referenced locality (in bytes).

Table 88 General performance counter /runtime/memory/resident#

Counter type

/runtime/memory/resident

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the allocated resident memory should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the amount of resident memory currently allocated by the referenced locality (in bytes).

Table 89 General performance counter /runtime/memory/total#

Counter type

/runtime/memory/total

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the total available memory should be queried. The locality id is a (zero based) number identifying the locality. Note: only supported on Linux.

Description

Returns the total available memory for use by the referenced locality (in bytes). This counter is available on Linux and Windows systems only.

Table 90 General performance counter /runtime/io/read_bytes_issued#

Counter type

/runtime/io/read_bytes_issued

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the number of bytes read should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the number of bytes read by the process (aggregate of count arguments passed to read() call or its analogues). This performance counter is available only on systems which expose the related data through the /proc file system.

Table 91 General performance counter /runtime/io/write_bytes_issued#

Counter type

/runtime/io/write_bytes_issued

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the number of bytes written should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the number of bytes written by the process (aggregate of count arguments passed to write() call or its analogues). This performance counter is available only on systems which expose the related data through the /proc file system.

Table 92 General performance counter /runtime/io/read_syscalls#

Counter type

/runtime/io/read_syscalls

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the number of system calls should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the number of system calls that perform I/O reads. This performance counter is available only on systems which expose the related data through the /proc file system.

Table 93 General performance counter /runtime/io/write_syscalls#

Counter type

/runtime/io/write_syscalls

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the number of system calls should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the number of system calls that perform I/O writes. This performance counter is available only on systems which expose the related data through the /proc file system.

Table 94 General performance counter /runtime/io/read_bytes_transferred#

Counter type

/runtime/io/read_bytes_transferred

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the number of bytes transferred should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the number of bytes retrieved from storage by I/O operations. This performance counter is available only on systems which expose the related data through the /proc file system.

Table 95 General performance counter /runtime/io/write_bytes_transferred#

Counter type

/runtime/io/write_bytes_transferred

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the number of bytes transferred should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the number of bytes transferred to storage by I/O operations. This performance counter is available only on systems which expose the related data through the /proc file system.

Table 96 General performance counter /runtime/io/write_bytes_cancelled#

Counter type

/runtime/io/write_bytes_cancelled

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality for which the number of bytes not being transferred should be queried. The locality id is a (zero based) number identifying the locality.

Description

Returns the number of bytes accounted for by write_bytes_transferred that have not ultimately been stored due to truncation or deletion. This performance counter is available only on systems which expose the related data through the /proc file system.

Table 97 Performance counter /papi/<papi_event>#

Counter type

/papi/<papi_event>

where:

<papi_event> is the name of the PAPI event to expose as a performance counter (such as PAPI_SR_INS). Note that the list of available PAPI events changes depending on the used architecture.

For a full list of available PAPI events and their (short) description use the --hpx:list-counters and --hpx:papi-event-info=all command line options.

Counter instance formatting

locality#*/total or

locality#*/worker-thread#*

where:

locality#* is defining the locality for which the current count of occurrences of the PAPI event should be queried. The locality id (given by *) is a (zero based) number identifying the locality.

worker-thread#* is defining the worker thread for which the current count of occurrences of the PAPI event should be queried. The worker thread number (given by the *) is a (zero based) number identifying the worker thread. The number of available worker threads is usually specified on the command line for the application using the option --hpx:threads.

Description

Returns the current count of occurrences of the specified PAPI event. This counter is available only if the configuration time constant HPX_WITH_PAPI is set to ON (default: OFF).

Table 98 Performance counter /statistics/average#

Counter type

/statistics/average

Counter instance formatting

Any full performance counter name. The referenced performance counter is queried at fixed time intervals as specified by the first parameter.

Description

Returns the current average (mean) value calculated based on the values queried from the underlying counter (the one specified as the instance name).

Parameters

Any parameter will be interpreted as a list of up to two comma separated (integer) values, where the first is the time interval (in milliseconds) at which the underlying counter should be queried. If no value is specified, the counter will assume 1000 [ms] as the default. The second value can be either 0 or 1 and specifies whether the underlying counter should be reset during evaluation (1) or not (0). The default value is 0.
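The parameter handling described above can be sketched as follows. This is an illustration only and not the parser used by HPX; the names statistics_params and parse_statistics_params are hypothetical. The sketch applies the documented defaults (1000 ms, no reset) to any value that is left out.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical types and names, not part of HPX.
struct statistics_params
{
    std::int64_t interval_ms = 1000;    // documented default interval
    bool reset = false;                 // documented default: 0 (no reset)
};

// Parses "interval[,reset]" as described for the /statistics counters.
// Missing or empty fields keep their defaults. std::stoll throws
// std::invalid_argument if a field is not an integer.
inline statistics_params parse_statistics_params(std::string const& text)
{
    statistics_params p;
    std::istringstream in(text);
    std::string field;

    if (std::getline(in, field, ',') && !field.empty())
        p.interval_ms = std::stoll(field);
    if (std::getline(in, field, ',') && !field.empty())
        p.reset = (std::stoll(field) != 0);

    return p;
}
```

With this sketch, an empty parameter string yields an interval of 1000 ms without reset, while "500,1" yields an interval of 500 ms with reset enabled.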

Table 99 Performance counter /statistics/rolling_average#

Counter type

/statistics/rolling_average

Counter instance formatting

Any full performance counter name. The referenced performance counter is queried at fixed time intervals as specified by the first parameter.

Description

Returns the current rolling average (mean) value calculated based on the values queried from the underlying counter (the one specified as the instance name).

Parameters

Any parameter will be interpreted as a list of up to three comma separated (integer) values, where the first is the time interval (in milliseconds) at which the underlying counter should be queried. If no value is specified, the counter will assume 1000 [ms] as the default. The second value will be interpreted as the size of the rolling window (the number of latest values to use to calculate the rolling average). The default value for this is 10. The third value can be either 0 or 1 and specifies whether the underlying counter should be reset during evaluation (1) or not (0). The default value is 0.
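To make the meaning of the rolling window concrete, the following sketch computes a mean over the latest N samples only. It illustrates the concept and is not the implementation used by HPX; the class rolling_mean is a hypothetical name, and the default window size of 10 mirrors the default stated above.

```cpp
#include <cstddef>
#include <deque>
#include <numeric>

// Hypothetical class, not part of HPX.
// Keeps the latest 'window' samples and reports their mean.
class rolling_mean
{
public:
    explicit rolling_mean(std::size_t window = 10)
      : window_(window)
    {
    }

    // Adds a new sample and drops the oldest one once the window is full.
    void add(double sample)
    {
        samples_.push_back(sample);
        if (samples_.size() > window_)
            samples_.pop_front();
    }

    // Returns the mean of the samples currently in the window,
    // or 0.0 if no sample has been added yet.
    double value() const
    {
        if (samples_.empty())
            return 0.0;
        double const sum =
            std::accumulate(samples_.begin(), samples_.end(), 0.0);
        return sum / static_cast<double>(samples_.size());
    }

private:
    std::size_t window_;
    std::deque<double> samples_;
};
```

With a window of 3, adding the samples 1, 2, 3 gives a mean of 2. Adding 4 afterwards drops the oldest sample, so the mean becomes 3.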

Table 100 Performance counter /statistics/stddev#

Counter type

/statistics/stddev

Counter instance formatting

Any full performance counter name. The referenced performance counter is queried at fixed time intervals as specified by the first parameter.

Description

Returns the current standard deviation (stddev) value calculated based on the values queried from the underlying counter (the one specified as the instance name).

Parameters

Any parameter will be interpreted as a list of up to two comma separated (integer) values, where the first is the time interval (in milliseconds) at which the underlying counter should be queried. If no value is specified, the counter will assume 1000 [ms] as the default. The second value can be either 0 or 1 and specifies whether the underlying counter should be reset during evaluation (1) or not (0). The default value is 0.

Table 101 Performance counter /statistics/rolling_stddev#

Counter type

/statistics/rolling_stddev

Counter instance formatting

Any full performance counter name. The referenced performance counter is queried at fixed time intervals as specified by the first parameter.

Description

Returns the current rolling standard deviation (stddev) value calculated based on the values queried from the underlying counter (the one specified as the instance name).

Parameters

Any parameter will be interpreted as a list of up to three comma separated (integer) values, where the first is the time interval (in milliseconds) at which the underlying counter should be queried. If no value is specified, the counter will assume 1000 [ms] as the default. The second value will be interpreted as the size of the rolling window (the number of latest values to use to calculate the rolling standard deviation). The default value for this is 10. The third value can be either 0 or 1 and specifies whether the underlying counter should be reset during evaluation (1) or not (0). The default value is 0.

Table 102 Performance counter /statistics/median#

Counter type

/statistics/median

Counter instance formatting

Any full performance counter name. The referenced performance counter is queried at fixed time intervals as specified by the first parameter.

Description

Returns the current (statistically estimated) median value calculated based on the values queried from the underlying counter (the one specified as the instance name).

Parameters

Any parameter will be interpreted as a list of up to two comma separated (integer) values, where the first is the time interval (in milliseconds) at which the underlying counter should be queried. If no value is specified, the counter will assume 1000 [ms] as the default. The second value can be either 0 or 1 and specifies whether the underlying counter should be reset during evaluation (1) or not (0). The default value is 0.

Table 103 Performance counter /statistics/max#

Counter type

/statistics/max

Counter instance formatting

Any full performance counter name. The referenced performance counter is queried at fixed time intervals as specified by the first parameter.

Description

Returns the current maximum value calculated based on the values queried from the underlying counter (the one specified as the instance name).

Parameters

Any parameter will be interpreted as a list of up to two comma separated (integer) values, where the first is the time interval (in milliseconds) at which the underlying counter should be queried. If no value is specified, the counter will assume 1000 [ms] as the default. The second value can be either 0 or 1 and specifies whether the underlying counter should be reset during evaluation (1) or not (0). The default value is 0.

Table 104 Performance counter /statistics/rolling_max#

Counter type

/statistics/rolling_max

Counter instance formatting

Any full performance counter name. The referenced performance counter is queried at fixed time intervals as specified by the first parameter.

Description

Returns the current rolling maximum value calculated based on the values queried from the underlying counter (the one specified as the instance name).

Parameters

Any parameter will be interpreted as a list of up to three comma separated (integer) values, where the first is the time interval (in milliseconds) at which the underlying counter should be queried. If no value is specified, the counter will assume 1000 [ms] as the default. The second value will be interpreted as the size of the rolling window (the number of latest values to use to calculate the rolling maximum). The default value for this is 10. The third value can be either 0 or 1 and specifies whether the underlying counter should be reset during evaluation (1) or not (0). The default value is 0.

Table 105 Performance counter /statistics/min#

Counter type

/statistics/min

Counter instance formatting

Any full performance counter name. The referenced performance counter is queried at fixed time intervals as specified by the first parameter.

Description

Returns the current minimum value calculated based on the values queried from the underlying counter (the one specified as the instance name).

Parameters

Any parameter will be interpreted as a list of up to two comma separated (integer) values, where the first is the time interval (in milliseconds) at which the underlying counter should be queried. If no value is specified, the counter will assume 1000 [ms] as the default. The second value can be either 0 or 1 and specifies whether the underlying counter should be reset during evaluation (1) or not (0). The default value is 0.

Table 106 Performance counter /statistics/rolling_min#

Counter type

/statistics/rolling_min

Counter instance formatting

Any full performance counter name. The referenced performance counter is queried at fixed time intervals as specified by the first parameter.

Description

Returns the current rolling minimum value calculated based on the values queried from the underlying counter (the one specified as the instance name).

Parameters

Any parameter will be interpreted as a list of up to three comma separated (integer) values, where the first is the time interval (in milliseconds) at which the underlying counter should be queried. If no value is specified, the counter will assume 1000 [ms] as the default. The second value will be interpreted as the size of the rolling window (the number of latest values to use to calculate the rolling minimum). The default value for this is 10. The third value can be either 0 or 1 and specifies whether the underlying counter should be reset during evaluation (1) or not (0). The default value is 0.

Table 107 Performance counter /arithmetics/add#

Counter type

/arithmetics/add

Description

Returns the sum calculated based on the values queried from the underlying counters (the ones specified as the parameters).

Parameters

The parameter will be interpreted as a comma separated list of full performance counter names which are queried whenever this counter is accessed. Any wildcards in the counter names will be expanded.

Table 108 Performance counter /arithmetics/subtract#

Counter type

/arithmetics/subtract

Description

Returns the difference calculated based on the values queried from the underlying counters (the ones specified as the parameters).

Parameters

The parameter will be interpreted as a comma separated list of full performance counter names which are queried whenever this counter is accessed. Any wildcards in the counter names will be expanded.

Table 109 Performance counter /arithmetics/multiply#

Counter type

/arithmetics/multiply

Description

Returns the product calculated based on the values queried from the underlying counters (the ones specified as the parameters).

Parameters

The parameter will be interpreted as a comma separated list of full performance counter names which are queried whenever this counter is accessed. Any wildcards in the counter names will be expanded.

Table 110 Performance counter /arithmetics/divide#

Counter type

/arithmetics/divide

Description

Returns the result of division of the values queried from the underlying counters (the ones specified as the parameters).

Parameters

The parameter will be interpreted as a comma separated list of full performance counter names which are queried whenever this counter is accessed. Any wildcards in the counter names will be expanded.

Table 111 Performance counter /arithmetics/mean#

Counter type

/arithmetics/mean

Description

Returns the average value of all values queried from the underlying counters (the ones specified as the parameters).

Parameters

The parameter will be interpreted as a comma separated list of full performance counter names which are queried whenever this counter is accessed. Any wildcards in the counter names will be expanded.

Table 112 Performance counter /arithmetics/variance#

Counter type

/arithmetics/variance

Description

Returns the variance of all values queried from the underlying counters (the ones specified as the parameters).

Parameters

The parameter will be interpreted as a comma separated list of full performance counter names which are queried whenever this counter is accessed. Any wildcards in the counter names will be expanded.

Table 113 Performance counter /arithmetics/median#

Counter type

/arithmetics/median

Description

Returns the median value of all values queried from the underlying counters (the ones specified as the parameters).

Parameters

The parameter will be interpreted as a comma separated list of full performance counter names which are queried whenever this counter is accessed. Any wildcards in the counter names will be expanded.

Table 114 Performance counter /arithmetics/min#

Counter type

/arithmetics/min

Description

Returns the minimum value of all values queried from the underlying counters (the ones specified as the parameters).

Parameters

The parameter will be interpreted as a comma separated list of full performance counter names which are queried whenever this counter is accessed. Any wildcards in the counter names will be expanded.

Table 115 Performance counter /arithmetics/max#

Counter type

/arithmetics/max

Description

Returns the maximum value of all values queried from the underlying counters (the ones specified as the parameters).

Parameters

The parameter will be interpreted as a comma separated list of full performance counter names which are queried whenever this counter is accessed. Any wildcards in the counter names will be expanded.

Table 116 Performance counter /arithmetics/count#

Counter type

/arithmetics/count

Description

Returns the number of values queried from the underlying counters (the ones specified as the parameters).

Parameters

The parameter will be interpreted as a comma separated list of full performance counter names which are queried whenever this counter is accessed. Any wildcards in the counter names will be expanded.

Note

The /arithmetics counters can consume an arbitrary number of other counters. For this reason those have to be specified as parameters (a comma separated list of counters appended after a '@'). For instance:

$ ./bin/hello_world_distributed -t2 \
    --hpx:print-counter=/threads{locality#0/worker-thread#*}/count/cumulative \
    --hpx:print-counter=/arithmetics/add@/threads{locality#0/worker-thread#*}/count/cumulative
hello world from OS-thread 0 on locality 0
hello world from OS-thread 1 on locality 0
/threads{locality#0/worker-thread#0}/count/cumulative,1,0.515640,[s],25
/threads{locality#0/worker-thread#1}/count/cumulative,1,0.515520,[s],36
/arithmetics/add@/threads{locality#0/worker-thread#*}/count/cumulative,1,0.516445,[s],64

Since all wildcards in the parameters are expanded, this example is fully equivalent to specifying both counters separately to /arithmetics/add:

$ ./bin/hello_world_distributed -t2 \
    --hpx:print-counter=/threads{locality#0/worker-thread#*}/count/cumulative \
    --hpx:print-counter=/arithmetics/add@\
        /threads{locality#0/worker-thread#0}/count/cumulative,\
        /threads{locality#0/worker-thread#1}/count/cumulative
Table 117 Performance counter /coalescing/count/parcels#

Counter type

/coalescing/count/parcels

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality the number of parcels for the given action should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the number of parcels handled by the message handler associated with the action which is given by the counter parameter.

Parameters

The action type. This is the string which has been used while registering the action with HPX, e.g. which has been passed as the second parameter to the macro HPX_REGISTER_ACTION or HPX_REGISTER_ACTION_ID.

Table 118 Performance counter /coalescing/count/messages#

Counter type

/coalescing/count/messages

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality the number of messages for the given action should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the number of messages generated by the message handler associated with the action which is given by the counter parameter.

Parameters

The action type. This is the string which has been used while registering the action with HPX, e.g. which has been passed as the second parameter to the macro HPX_REGISTER_ACTION or HPX_REGISTER_ACTION_ID.

Table 119 Performance counter /coalescing/count/average-parcels-per-message#

Counter type

/coalescing/count/average-parcels-per-message

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality the average number of parcels per message for the given action should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the average number of parcels sent in a message generated by the message handler associated with the action which is given by the counter parameter.

Parameters

The action type. This is the string which has been used while registering the action with HPX, e.g. which has been passed as the second parameter to the macro HPX_REGISTER_ACTION or HPX_REGISTER_ACTION_ID.

Table 120 Performance counter /coalescing/time/average-parcel-arrival#

Counter type

/coalescing/time/average-parcel-arrival

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality the average time between parcels for the given action should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns the average time between arriving parcels for the action which is given by the counter parameter.

Parameters

The action type. This is the string which has been used while registering the action with HPX, e.g. which has been passed as the second parameter to the macro HPX_REGISTER_ACTION or HPX_REGISTER_ACTION_ID.

Table 121 Performance counter /coalescing/time/parcel-arrival-histogram#

Counter type

/coalescing/time/parcel-arrival-histogram

Counter instance formatting

locality#*/total

where:

* is the locality id of the locality the histogram of times between parcels for the given action should be queried for. The locality id is a (zero based) number identifying the locality.

Description

Returns a histogram representing the times between arriving parcels for the action which is given by the counter parameter.

This counter returns an array of values, where the first three values represent the three parameters used for the histogram followed by one value for each of the histogram buckets.

The first unit of measure displayed for this counter [ns] refers to the lower and upper boundary values in the returned histogram data only. The second unit of measure displayed [0.1%] refers to the actual histogram data.

For each bucket the counter shows a value between 0 and 1000 which corresponds to a percentage value between 0% and 100%.

Parameters

The action type and optional histogram parameters. The action type is the string which has been used while registering the action with HPX, e.g. which has been passed as the second parameter to the macro HPX_REGISTER_ACTION or HPX_REGISTER_ACTION_ID.

The action type may be followed by a comma separated list of up-to three numbers: the lower and upper boundaries for the collected histogram, and the number of buckets for the histogram to generate. By default these three numbers will be assumed to be 0 ([ns], lower bound), 1000000 ([ns], upper bound), and 20 (number of buckets to generate).

Note

The performance counters related to parcel coalescing are available only if the configuration time constant HPX_WITH_PARCEL_COALESCING is set to ON (default: ON). However, even in this case it will be available only for actions that are enabled for parcel coalescing (see the macros HPX_ACTION_USES_MESSAGE_COALESCING and HPX_ACTION_USES_MESSAGE_COALESCING_NOTHROW).

1

A message can potentially consist of more than one parcel.

APEX integration#

HPX provides integration with APEX, a framework for application profiling using task timers and various performance counters (Huck et al. [2]). It can be added as a git submodule by turning on the option HPX_WITH_APEX:BOOL during CMake configuration. TAU is an optional dependency when using APEX.

To build HPX with APEX, add HPX_WITH_APEX=ON, and, optionally, Tau_ROOT=$PATH_TO_TAU to your CMake configuration. In addition, you can override the tag used for APEX with the HPX_WITH_APEX_TAG option. Please see the APEX HPX documentation for detailed instructions on using APEX with HPX.

References#
2

K. A. Huck, A. Porterfield, N. Chaimov, H. Kaiser, A. D. Malony, T. Sterling, and R. Fowler. An autonomic performance environment for exascale. Supercomputing Frontiers and Innovations, 2015.

Using the LCI parcelport#

Basic information#

The Lightweight Communication Interface (LCI) is an ongoing research project aiming to provide efficient support for applications with irregular and asynchronous communication patterns, such as graph analysis, sparse linear algebra, and task-based runtimes, on modern parallel architectures. Its features include (a) support for more communication primitives, such as two-sided send/recv and one-sided (dynamic or direct) remote put/get; (b) better multi-threaded performance; (c) explicit user control of communication resources; and (d) flexible signaling mechanisms, such as synchronizers, completion queues, and active message handlers. It is designed to be a low-level communication library used by high-level libraries and frameworks.

The LCI parcelport is an experimental parcelport. It aims to provide the best possible communication performance on high-performance computing platforms. Compared to the MPI parcelport, it uses far fewer messages and memory copies to transfer an HPX parcel over the network. Its message transmission path involves a minimal number of synchronization points and is almost lock-free. It is expected to be much faster than the MPI parcelport.

Build HPX with the LCI parcelport#

While building HPX, you can specify a set of CMake variables to enable and configure the LCI parcelport. Below, there is a set of the most important and frequently used CMake variables.

HPX_WITH_PARCELPORT_LCI#

Enable the LCI parcelport. This enables the use of LCI for networking operations in the HPX runtime. The default value is OFF because it’s not available on all systems and/or requires another dependency. However, this experimental parcelport may provide better performance than the MPI parcelport. You must set this variable to ON in order to use the LCI parcelport. All the following variables only make sense when this variable is set to ON.

HPX_WITH_FETCH_LCI#

Use FetchContent to fetch LCI. The default value is OFF. If this option is set to OFF, you need to install your own LCI library, and HPX will try to find it using CMake find_package. You can specify the location of the LCI installation with the environment variable LCI_ROOT. Refer to the LCI README for how to install LCI. If this option is set to ON, HPX will fetch and build LCI for you. You can use the following CMake variables to configure this behavior for your platform.

HPX_WITH_LCI_TAG#

This variable only takes effect when HPX_WITH_FETCH_LCI is set to ON and FETCHCONTENT_SOURCE_DIR_LCI is not set. HPX will fetch LCI from its GitHub repository. This variable controls the branch or tag from which LCI is fetched. The default value is set conservatively for the stability of HPX, but users are welcome to set this variable to master for potentially better performance.

FETCHCONTENT_SOURCE_DIR_LCI#

This variable only takes effect when HPX_WITH_FETCH_LCI is set to ON. When it is defined, HPX_WITH_LCI_TAG will be ignored. It accepts a path to a local version of the LCI source code, and HPX will fetch and build LCI from there.

Run HPX with the LCI parcelport#

The LCI parcelport is launched using the same mechanisms as the MPI parcelport, so you can run applications the same way in both cases. Typically, this means using hpxrun.py, mpirun, or srun.

hpxrun.py serves as a wrapper for mpirun and srun. If you are using hpxrun.py, pass -p lci to the script. You also need to pass either -r mpi or -r srun to select the correct run wrapper for your platform.

If you are using mpirun or srun, you can just pass --hpx:ini=hpx.parcel.lci.priority=1000, --hpx:ini=hpx.parcel.lci.enable=1, and --hpx:ini=hpx.parcel.bootstrap=lci to the HPX applications.

The hpxrun.py argument -r none (the default option for the run wrapper) and its corresponding HPX arguments --hpx:hpx and --hpx:agas do not work for the MPI or the LCI parcelport.

Performance tuning of the LCI parcelport#

We encourage users to set the following environment variables when using the LCI parcelport to get better performance.

$ export LCI_SERVER_MAX_SENDS=1024
$ export LCI_SERVER_MAX_RECVS=4096
$ export LCI_SERVER_NUM_PKTS=65536
$ export LCI_SERVER_MAX_CQES=65536
$ export LCI_PACKET_SIZE=12288

This setting needs roughly 800 MB of memory per process. The memory consumption mainly comes from the packets and can be estimated as LCI_SERVER_NUM_PKTS × LCI_PACKET_SIZE.

In addition, users can tune the following command-line options when using the LCI parcelport to get better performance.

--hpx:ini=hpx.parcel.lci.ndevices=<int>#

The number of LCI devices to use. The default value is 2. An LCI device represents a collection of network resources. More devices lead to lower thread contention, but too many devices may lead to load imbalance or hardware overhead.

--hpx:ini=hpx.parcel.lci.progress_type=<worker|rp>#

The way to progress the LCI device. The default value is worker. The worker option uses all worker threads to progress the LCI devices. The rp option uses dedicated pinned threads to progress the LCI devices. Normally, the worker option gives better performance, but the rp option has been observed to perform better on some clusters with previous-generation InfiniBand hardware.

HPX runtime and resources#

HPX thread scheduling policies#

The HPX runtime has six thread scheduling policies: local-priority, static-priority, local, static, local-workrequesting-fifo, and abp-priority. These policies can be specified from the command line using the command line option --hpx:queuing. In order to use a particular scheduling policy, the runtime system must be built with the appropriate scheduler flag turned on (e.g. cmake -DHPX_THREAD_SCHEDULERS=local, see CMake options for more information).

Priority local scheduling policy (default policy)#

The priority local scheduling policy maintains one queue per operating system (OS) thread. The OS thread pulls its work from this queue. By default the number of high priority queues is equal to the number of OS threads; the number of high priority queues can be specified on the command line using --hpx:high-priority-threads. High priority threads are executed by any of the OS threads before any other work is executed. When a queue is empty, work will be taken from high priority queues first. There is one low priority queue from which threads will be scheduled only when there is no other work.

For this scheduling policy there is an option to turn on NUMA sensitivity using the command line option --hpx:numa-sensitive. When NUMA sensitivity is turned on, work stealing is done from queues associated with the same NUMA domain first; only after that is work stolen from other NUMA domains.

This scheduler is enabled at build time by default and uses the FIFO (first-in-first-out) queueing policy. This policy can be selected using --hpx:queuing=local-priority-fifo. The scheduler can also be used with the LIFO (last-in-first-out) policy. This is not the default policy and must be selected using the command line option --hpx:queuing=local-priority-lifo.

Static priority scheduling policy#

The static priority scheduling policy maintains one queue per OS thread from which each OS thread pulls its tasks (user threads). Threads are distributed in a round robin fashion. There is no thread stealing in this policy.

Local scheduling policy#
  • invoke using: --hpx:queuing=local (or -ql)

  • flag to turn on for build: HPX_THREAD_SCHEDULERS=all or HPX_THREAD_SCHEDULERS=local

The local scheduling policy maintains one queue per OS thread from which each OS thread pulls its tasks (user threads).

Static scheduling policy#
  • invoke using: --hpx:queuing=static

  • flag to turn on for build: HPX_THREAD_SCHEDULERS=all or HPX_THREAD_SCHEDULERS=static

The static scheduling policy maintains one queue per OS thread from which each OS thread pulls its tasks (user threads). Threads are distributed in a round robin fashion. There is no thread stealing in this policy.

Priority ABP scheduling policy#
  • invoke using: --hpx:queuing=abp-priority-fifo

  • flag to turn on for build: HPX_THREAD_SCHEDULERS=all or HPX_THREAD_SCHEDULERS=abp-priority

The priority ABP policy maintains a double-ended lock-free queue for each OS thread. By default the number of high priority queues is equal to the number of OS threads; the number of high priority queues can be specified on the command line using --hpx:high-priority-threads. High priority threads are executed by the first OS threads before any other work is executed. When a queue is empty, work will be taken from high priority queues first. There is one low priority queue from which threads will be scheduled only when there is no other work. For this scheduling policy there is an option to turn on NUMA sensitivity using the command line option --hpx:numa-sensitive. When NUMA sensitivity is turned on, work stealing is done from queues associated with the same NUMA domain first; only after that is work stolen from other NUMA domains.

This scheduler can be used with two underlying queuing policies (FIFO: first-in-first-out, and LIFO: last-in-first-out). In order to use the LIFO policy use the command line option --hpx:queuing=abp-priority-lifo.

Work requesting scheduling policies#

The work-requesting policies balance work between cores using a different mechanism than the other policies listed above. Instead of actively stealing work from other cores, they rely on a less disruptive request mechanism. If a core runs out of work, it does not inspect the queues of neighboring cores; it posts a request to another core instead. That core (whenever it is not busy with other work) either responds to the requesting core by sending back work or passes the request on to the next possible core in the system. In general, this scheme avoids contention on the work queues, as each queue is only ever accessed by its own core.

The HPX resource partitioner#

The HPX resource partitioner lets you take the execution resources available on a system (processing units, cores, and NUMA domains) and assign them to thread pools. By default HPX creates a single thread pool named default. While this is good for most use cases, the resource partitioner lets you create multiple thread pools with custom resources and options.

Creating custom thread pools is useful for cases where you have tasks which absolutely need to run without interference from other tasks. An example of this is when using MPI for distribution instead of the built-in mechanisms in HPX (useful in legacy applications). In this case one can create a thread pool containing a single thread for MPI communication. MPI tasks will then always run on the same thread, instead of potentially being stuck in a queue behind other threads.

Note that HPX thread pools are completely independent from each other in the sense that task stealing will never happen between different thread pools. However, tasks running on a particular thread pool can schedule tasks on another thread pool.

Note

It is simpler in some situations to schedule important tasks with high priority instead of using a separate thread pool.

Using the resource partitioner#

The hpx::resource::partitioner is now created during HPX runtime initialization without explicit action needed from the user. To specify some of the initialization parameters you can use the hpx::init_params.

The resource partitioner callback is the interface to add thread pools to the HPX runtime and to assign resources to the thread pools. In order to create custom thread pools you can specify the resource partitioner callback hpx::init_params::rp_callback, which is called once the resource partitioner has been created (see the example below). You can also specify other parameters, see hpx::init_params.

To add a thread pool use the hpx::resource::partitioner::create_thread_pool method. If you simply want to use the default scheduler and scheduler options, it is enough to call rp.create_thread_pool("my-thread-pool").

Then, to add resources to the thread pool you can use the hpx::resource::partitioner::add_resource method. The resource partitioner exposes the hardware topology retrieved using Portable Hardware Locality (HWLOC) and lets you iterate through the topology to add the wanted processing units to the thread pool. The example in the next section adds all processing units from the first NUMA domain to a custom thread pool, unless there is only one NUMA domain, in which case it leaves the first processing unit for the default thread pool.

Note

Whatever processing units are not assigned to a thread pool by the time hpx::init is called will be added to the default thread pool. It is also possible to explicitly add processing units to the default thread pool, and to create the default thread pool manually (in order to e.g. set the scheduler type).

Tip

The command line option --hpx:print-bind is useful for checking that the thread pools have been set up the way you expect.

Difference between the old and new version#

In the old version, you had to create an instance of the resource_partitioner with argc and argv.

int main(int argc, char** argv)
{
    hpx::resource::partitioner rp(argc, argv);
    hpx::init();
}

From HPX 1.5.0 onwards, you just pass argc and argv to hpx::init() or hpx::start() for the binding options to be parsed by the resource partitioner.

int main(int argc, char** argv)
{
    hpx::init_params init_args;
    hpx::init(argc, argv, init_args);
}

In the old version, when creating a custom thread pool, you just called the utilities on the resource partitioner instantiated previously.

int main(int argc, char** argv)
{
    hpx::resource::partitioner rp(argc, argv);

    rp.create_thread_pool("my-thread-pool");

    bool one_numa_domain = rp.numa_domains().size() == 1;
    bool skipped_first_pu = false;

    hpx::resource::numa_domain const& d = rp.numa_domains()[0];

    for (const hpx::resource::core& c : d.cores())
    {
        for (const hpx::resource::pu& p : c.pus())
        {
            if (one_numa_domain && !skipped_first_pu)
            {
                skipped_first_pu = true;
                continue;
            }

            rp.add_resource(p, "my-thread-pool");
        }
    }

    hpx::init();
}

You now specify the resource partitioner callback which will tie the resources to the resource partitioner created during runtime initialization.

void init_resource_partitioner_handler(hpx::resource::partitioner& rp)
{
    rp.create_thread_pool("my-thread-pool");

    bool one_numa_domain = rp.numa_domains().size() == 1;
    bool skipped_first_pu = false;

    hpx::resource::numa_domain const& d = rp.numa_domains()[0];

    for (const hpx::resource::core& c : d.cores())
    {
        for (const hpx::resource::pu& p : c.pus())
        {
            if (one_numa_domain && !skipped_first_pu)
            {
                skipped_first_pu = true;
                continue;
            }

            rp.add_resource(p, "my-thread-pool");
        }
    }
}

int main(int argc, char* argv[])
{
    hpx::init_params init_args;
    init_args.rp_callback = &init_resource_partitioner_handler;

    hpx::init(argc, argv, init_args);
}
Advanced usage#

It is possible to customize the built-in schedulers by passing scheduler options to hpx::resource::partitioner::create_thread_pool. It is also possible to create and use custom schedulers.

Note

It is not recommended to create your own scheduler. The HPX developers use this to experiment with new scheduler designs before making them available to users via the standard mechanisms of choosing a scheduler (command line options). If you would like to experiment with a custom scheduler the resource partitioner example shared_priority_queue_scheduler.cpp contains a fully implemented scheduler with logging, etc. to make exploration easier.

To choose a scheduler and custom mode for a thread pool, pass additional options when creating the thread pool like this:

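// Note: `default` is a C++ keyword, so the enumerator is spelled default_.
rp.create_thread_pool("my-thread-pool",
    hpx::resource::scheduling_policy::local_priority_lifo,
    hpx::threads::policies::scheduler_mode(
        hpx::threads::policies::scheduler_mode::default_ |
        hpx::threads::policies::scheduler_mode::enable_elasticity));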

The available schedulers are documented here: hpx::resource::scheduling_policy, and the available scheduler modes here: hpx::threads::policies::scheduler_mode. Also see the examples folder for examples of advanced resource partitioner usage: simple_resource_partitioner.cpp and oversubscribing_resource_partitioner.cpp.

Executors#

Executors in HPX provide a flexible way to control how, when, and where tasks are executed. Instead of manually creating threads or managing thread pools, you can hand your tasks to an executor, and it takes care of the details of running them.

This page introduces the concept of executors, the main types available in HPX, and how to create custom executors.

What is an executor?#

An executor is an abstraction that separates the what from the how of task execution:

  • What: the work to be performed (e.g. a function or task).

  • How: whether the task runs synchronously, asynchronously, or in parallel across multiple cores.

By using executors, you can switch between execution strategies without rewriting your algorithms. HPX provides a rich set of executors with a unified API.

Main Executors in HPX#

HPX provides multiple executors. Below are three of the most commonly used:

Parallel Executor#
  • Default in HPX.

  • Creates a new HPX thread for every scheduled task.

  • Works well for large tasks, but frequent small tasks incur overhead due to thread creation/destruction.

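#include <hpx/hpx_main.hpp>
#include <hpx/execution.hpp>
#include <hpx/algorithm.hpp>
#include <vector>

int main()
{
    std::vector<int> data(100, 1);

    // Increment every element using the parallel execution policy
    hpx::for_each(
        hpx::execution::par,
        data.begin(), data.end(),
        [](int &x) {
            x += 1;
        }
    );

    return 0;
}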
Fork-Join Executor#
  • Spawns one thread per CPU core when created.

  • Threads are reused across tasks, avoiding repeated creation costs.

  • Efficient for workloads with many sequential parallel regions.

  • May waste resources if parallel work is limited.

#include <hpx/hpx_main.hpp>
#include <hpx/execution.hpp>
#include <hpx/algorithm.hpp>
#include <vector>

int main()
{
    std::vector<int> data(100, 1);

    // The executor's threads are created once and reused by the algorithm
    hpx::execution::experimental::fork_join_executor exec;

    hpx::for_each(
        hpx::execution::par.on(exec),
        data.begin(), data.end(),
        [](int &x) {
            x += 1;
        }
    );

    return 0;
}
Sequential Executor#
  • Executes all tasks synchronously on the calling thread.

  • Useful for debugging (deterministic execution) or when parallelism brings no benefit.

  • Maintains the same executor-based API while running tasks sequentially.

#include <hpx/hpx_main.hpp>
#include <hpx/execution.hpp>
#include <hpx/algorithm.hpp>
#include <vector>

int main()
{
    std::vector<int> data(100, 1);

    // All iterations run one after another on the calling thread
    hpx::execution::sequenced_executor seq_exec;

    hpx::for_each(
        hpx::execution::seq.on(seq_exec),
        data.begin(), data.end(),
        [](int &x) { x += 1; }
    );

    return 0;
}
Executors in real-world applications#

A practical example is the LULESH mini-application (a hydrodynamics benchmark).

_images/lulesh_parallel.png

Fig. 9 Parallel executor: creates and destroys threads for each loop.#

_images/lulesh_fork_join.png

Fig. 10 Fork-join executor: reuses threads across loops.#

  • With the parallel executor, each parallel loop creates and destroys new threads, introducing overhead.

  • With the fork-join executor, threads are created once and reused across loops, reducing overhead and improving performance.

In studies, the fork-join executor achieved significant speedups, in some cases more than twice as fast as traditional OpenMP implementations.

_images/lulesh_speedup_comparison.png

Fig. 11 Performance comparison between executors in the LULESH benchmark.#

Custom executors#

While HPX provides a variety of built-in executors, you may sometimes need to adapt task execution to your own requirements. This is where custom executors come in. By writing a small wrapper around an existing executor, you can extend its behavior (for example, to add logging, profiling information, or special scheduling rules) while still taking advantage of the HPX executor API.

Custom annotating executor#

The following example shows how to implement a simple executor that annotates tasks with a string label for easier debugging and profiling.

Note

Annotations do not affect how tasks run or what results they produce. Their main purpose is to give human-readable names to tasks so that they can be identified in profilers and debuggers.

Full example code#
#include <hpx/hpx_main.hpp>
#include <hpx/include/parallel_executors.hpp>
#include <hpx/include/async.hpp>
#include <hpx/execution.hpp>
#include <hpx/modules/async_base.hpp>
#include <hpx/modules/threading_base.hpp>

#include <iostream>
#include <string>
#include <utility>

template <typename BaseExecutor>
struct simple_annotating_executor
{
    BaseExecutor base_;
    const char* annotation_;

    simple_annotating_executor(BaseExecutor exec, const char* ann)
        : base_(std::move(exec)), annotation_(ann)
    {}

    // Non-blocking one-way executor
    template <typename F, typename... Ts>
    friend void tag_invoke(hpx::parallel::execution::post_t,
                        simple_annotating_executor const& exec,
                        F&& f, Ts&&... ts)
    {
        hpx::post(
            hpx::annotated_function(std::forward<F>(f), exec.annotation_),
            std::forward<Ts>(ts)...);
    }

    // Synchronous execution
    template <typename F, typename... Ts>
    friend auto tag_invoke(hpx::parallel::execution::sync_execute_t,
                        simple_annotating_executor const& exec,
                        F&& f, Ts&&... ts)
    {
        return hpx::parallel::execution::sync_execute(
            exec.base_,
            hpx::annotated_function(std::forward<F>(f), exec.annotation_),
            std::forward<Ts>(ts)...);
    }
};

// Example functions
int compute_square(int x)
{
    std::cout << "[sync_execute] Running task with annotation\n";
    return x * x;
}

void say_hello()
{
    std::cout << "[post] Running task with annotation\n";
}

int main()
{
    simple_annotating_executor exec(hpx::execution::parallel_executor{}, "my_custom_task");

    // Synchronous execution
    int result = hpx::parallel::execution::sync_execute(exec, &compute_square, 7);
    std::cout << "Result from sync_execute: " << result << "\n";

    // Post a task
    hpx::parallel::execution::post(exec, &say_hello);

    return 0;
}
Explanation#

The first lines pull in the necessary HPX headers for executors, asynchronous execution, and annotated functions. The key facility here is hpx::annotated_function, declared in hpx/threading_base/annotated_function.hpp (made available in this example through hpx/modules/threading_base.hpp), which tags tasks with a string label. We then define a simple_annotating_executor that wraps another executor and associates an annotation string with every task:

template <typename BaseExecutor>
struct simple_annotating_executor
{
    BaseExecutor base_;
    const char* annotation_;

    simple_annotating_executor(BaseExecutor exec, const char* ann)
        : base_(std::move(exec)), annotation_(ann)
    {}
};

The post customization schedules a task to run asynchronously. We wrap the task in hpx::annotated_function so that it carries the annotation. Executors in HPX customize these operations through tag_invoke overloads, which are selected by special tag types such as post_t and sync_execute_t. This is why the executor interface may look different from a normal member function API.

template <typename F, typename... Ts>
friend void tag_invoke(hpx::parallel::execution::post_t,
                    simple_annotating_executor const& exec,
                    F&& f, Ts&&... ts)
{
    hpx::post(
        hpx::annotated_function(std::forward<F>(f), exec.annotation_),
        std::forward<Ts>(ts)...);
}

The sync_execute customization runs a task immediately and returns the result. Again, we wrap the function with an annotation before executing. The key difference is that post schedules a task in a fire-and-forget style (no result is returned), while sync_execute blocks until the task finishes and gives you the result back.

template <typename F, typename... Ts>
friend auto tag_invoke(hpx::parallel::execution::sync_execute_t,
                    simple_annotating_executor const& exec,
                    F&& f, Ts&&... ts)
{
    return hpx::parallel::execution::sync_execute(
        exec.base_,
        hpx::annotated_function(std::forward<F>(f), exec.annotation_),
        std::forward<Ts>(ts)...);
}

Note how we delegate the actual execution to exec.base_, the underlying executor. This makes the custom executor lightweight: it only adds annotations, while leaving the scheduling strategy to the base executor (here, a parallel_executor).

We define two simple functions to demonstrate both synchronous and asynchronous execution. These functions also print to std::cout, but this output is not the actual annotation. Annotations are stored internally by HPX and become visible when you use debugging or profiling tools.

int compute_square(int x)
{
    std::cout << "[sync_execute] Running task with annotation\n";
    return x * x;
}

void say_hello()
{
    std::cout << "[post] Running task with annotation\n";
}

Finally, in main we create the executor with a base executor and annotation string. We then run one task with sync_execute (blocking, returns result) and one with post (asynchronous, fire-and-forget). You can also create multiple annotating executors with different strings, so each task gets its own label. This is especially useful in larger applications with many different kinds of tasks, where annotations make it much easier to trace what is happening.

int main()
{
    simple_annotating_executor exec(hpx::execution::parallel_executor{}, "my_custom_task");

    // Synchronous execution
    int result = hpx::parallel::execution::sync_execute(exec, &compute_square, 7);
    std::cout << "Result from sync_execute: " << result << "\n";

    // Post a task
    hpx::parallel::execution::post(exec, &say_hello);

    return 0;
}
Custom annotating executor with parallel algorithms#

The following example demonstrates how to use a custom annotating executor with HPX parallel algorithms, such as for_each. This allows you to attach annotations to tasks while executing them in parallel.

Full example code#
#include <hpx/execution.hpp>
#include <hpx/hpx_main.hpp>
#include <hpx/include/parallel_algorithm.hpp>
#include <hpx/include/parallel_executors.hpp>
#include <hpx/modules/threading_base.hpp>

#include <iostream>
#include <utility>
#include <vector>

template <typename BaseExecutor>
struct simple_annotating_executor
{
    BaseExecutor base_;
    char const* annotation_;

    using execution_category =
        hpx::traits::executor_execution_category_t<BaseExecutor>;

    simple_annotating_executor(BaseExecutor exec, char const* ann)
    : base_(std::move(exec))
    , annotation_(ann)
    {
    }

    // Bulk async_execute (used by parallel algorithms)
    template <typename F, typename Shape, typename... Ts>
    friend auto tag_invoke(hpx::parallel::execution::bulk_async_execute_t,
        simple_annotating_executor const& exec, F&& f, Shape const& shape,
        Ts&&... ts)
    {
        return hpx::parallel::execution::bulk_async_execute(
            exec.base_,
            hpx::annotated_function(std::forward<F>(f), exec.annotation_),
            shape, std::forward<Ts>(ts)...);
    }
};

namespace hpx::execution::experimental {

    // The annotating executor exposes the same executor categories as its
    // underlying (wrapped) executor.

    template <typename BaseExecutor>
    struct is_never_blocking_one_way_executor<
        simple_annotating_executor<BaseExecutor>>
    : is_never_blocking_one_way_executor<BaseExecutor>
    {
    };

    template <typename BaseExecutor>
    struct is_one_way_executor<simple_annotating_executor<BaseExecutor>>
    : is_one_way_executor<BaseExecutor>
    {
    };

    template <typename BaseExecutor>
    struct is_two_way_executor<simple_annotating_executor<BaseExecutor>>
    : is_two_way_executor<BaseExecutor>
    {
    };
}    // namespace hpx::execution::experimental

int main()
{
    using base_executor = hpx::execution::parallel_executor;
    simple_annotating_executor exec(base_executor{}, "for_each_task");

    std::vector<int> data = {1, 2, 3, 4, 5};

    // Use the custom executor with a parallel algorithm
    hpx::for_each(hpx::execution::par.on(exec),    // attach executor
        data.begin(), data.end(), [](int& x) {
            std::cout << "Processing " << x << " on thread "
                    << hpx::get_worker_thread_num() << "\n";
            x *= x;
        });

    std::cout << "Squared values: ";
    for (int v : data)
        std::cout << v << " ";
    std::cout << "\n";

    return 0;
}
Explanation#

As before, the first lines pull in the necessary HPX headers for executors, parallel algorithms, and annotated functions. The key one here is hpx/modules/threading_base.hpp, which provides the facility to tag tasks with a string label. We then define a simple_annotating_executor that wraps another executor and associates an annotation string with every task:

template <typename BaseExecutor>
struct simple_annotating_executor
{
    BaseExecutor base_;
    const char* annotation_;

    using execution_category =
        hpx::traits::executor_execution_category_t<BaseExecutor>;

    simple_annotating_executor(BaseExecutor exec, const char* ann)
        : base_(std::move(exec)), annotation_(ann)
    {}
};

Note that we expose the execution category of the custom executor with using execution_category = hpx::traits::executor_execution_category_t<BaseExecutor>. We inherit the execution category from the underlying executor (BaseExecutor), which ensures that our simple_annotating_executor behaves like the base executor in terms of parallelism and task execution capabilities.

The bulk_async_execute customization schedules a set of tasks to run asynchronously. We wrap each task in hpx::annotated_function so that it carries the annotation. Executors in HPX customize these operations through tag_invoke overloads, which are selected by special tag objects like bulk_async_execute_t. This is why the executor interface may look different from a normal member function API - it uses tag dispatch to integrate seamlessly with the HPX parallel algorithms infrastructure.

template <typename F, typename Shape, typename... Ts>
friend auto tag_invoke(hpx::parallel::execution::bulk_async_execute_t,
    simple_annotating_executor const& exec, F&& f, Shape const& shape,
    Ts&&... ts)
{
    return hpx::parallel::execution::bulk_async_execute(
        exec.base_,
        hpx::annotated_function(std::forward<F>(f), exec.annotation_),
        shape, std::forward<Ts>(ts)...);
}

Note how we delegate the actual execution to exec.base_, the underlying executor. This makes the custom executor lightweight: it only adds annotations, while leaving the scheduling strategy to the base executor (here, a parallel_executor).
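The delegation pattern can also be illustrated without HPX. The following is a minimal sketch in standard C++, not the HPX implementation: the sketch namespace, sync_execute_t, inline_executor, and labeling_executor are hypothetical names introduced here, and the log vector merely stands in for the annotation that HPX stores internally. It shows how a tag object selects a tag_invoke overload through argument-dependent lookup, and how a wrapper adds behavior before forwarding to its base executor.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

    // Tag type and customization point object (hypothetical names). Calling
    // sync_execute(exec, f, args...) forwards to the tag_invoke overload that
    // argument-dependent lookup finds for the executor type.
    struct sync_execute_t
    {
        template <typename Executor, typename F, typename... Ts>
        auto operator()(Executor const& exec, F&& f, Ts&&... ts) const
        {
            return tag_invoke(
                *this, exec, std::forward<F>(f), std::forward<Ts>(ts)...);
        }
    };
    constexpr sync_execute_t sync_execute{};

    // A trivial base executor: runs the task inline on the calling thread.
    struct inline_executor
    {
        template <typename F, typename... Ts>
        friend auto tag_invoke(
            sync_execute_t, inline_executor const&, F&& f, Ts&&... ts)
        {
            return std::forward<F>(f)(std::forward<Ts>(ts)...);
        }
    };

    // A wrapper that records a label and then delegates to the base executor.
    // The log is an assumption of this sketch; it stands in for an annotation.
    template <typename Base>
    struct labeling_executor
    {
        Base base_;
        char const* label_;
        std::vector<std::string>* log_;

        template <typename F, typename... Ts>
        friend auto tag_invoke(sync_execute_t,
            labeling_executor const& exec, F&& f, Ts&&... ts)
        {
            assert(exec.log_ != nullptr);
            exec.log_->push_back(exec.label_);
            // The wrapper does not run the task itself; the base decides how.
            return sync_execute(
                exec.base_, std::forward<F>(f), std::forward<Ts>(ts)...);
        }
    };
}    // namespace sketch
```

Calling sketch::sync_execute(exec, f, 7) on a labeling_executor<inline_executor> first records the label and then runs f through the base executor, so the wrapper never needs to know how the base schedules work.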

The hpx::execution::experimental namespace contains traits that describe executor capabilities, such as whether an executor can run tasks one-way, two-way, or never-blocking. These traits are used internally by HPX to verify that an executor is compatible with a given parallel algorithm or execution policy.

namespace hpx::execution::experimental {

    // The annotating executor exposes the same executor categories as its
    // underlying (wrapped) executor.

    template <typename BaseExecutor>
    struct is_never_blocking_one_way_executor<
        simple_annotating_executor<BaseExecutor>>
    : is_never_blocking_one_way_executor<BaseExecutor>
    {
    };

    template <typename BaseExecutor>
    struct is_one_way_executor<simple_annotating_executor<BaseExecutor>>
    : is_one_way_executor<BaseExecutor>
    {
    };

    template <typename BaseExecutor>
    struct is_two_way_executor<simple_annotating_executor<BaseExecutor>>
    : is_two_way_executor<BaseExecutor>
    {
    };
}
  • is_never_blocking_one_way_executor indicates whether the executor can schedule tasks in a fire-and-forget style without blocking.

  • is_one_way_executor indicates support for one-way execution (tasks can be scheduled but no result is returned).

  • is_two_way_executor indicates support for two-way execution (tasks return a result or a future).

In all cases, the custom executor inherits the capabilities of the base executor, so it integrates seamlessly with HPX algorithms.

This design ensures that simple_annotating_executor can be used anywhere its underlying executor could be used, while still adding the annotation functionality. It keeps the custom executor lightweight and fully compatible with the parallel algorithms infrastructure.
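The trait-forwarding idea can be checked at compile time in a small sketch written in standard C++. Everything in the sketch namespace is hypothetical and only mirrors the shape of the HPX traits; it is not how HPX defines them. A wrapper's trait derives from the trait of the wrapped type, so the answer for the wrapper is always the answer for its base.

```cpp
#include <type_traits>

namespace sketch {

    // Primary template: by default a type is not a two-way executor.
    template <typename T>
    struct is_two_way_executor : std::false_type
    {
    };

    struct base_executor
    {
    };

    struct not_an_executor
    {
    };

    // The base executor opts in explicitly.
    template <>
    struct is_two_way_executor<base_executor> : std::true_type
    {
    };

    template <typename Base>
    struct wrapping_executor
    {
        Base base_;
    };

    // The wrapper forwards the question to whatever it wraps.
    template <typename Base>
    struct is_two_way_executor<wrapping_executor<Base>>
      : is_two_way_executor<Base>
    {
    };

    static_assert(
        is_two_way_executor<wrapping_executor<base_executor>>::value,
        "a wrapped two-way executor is still a two-way executor");
    static_assert(
        !is_two_way_executor<wrapping_executor<not_an_executor>>::value,
        "wrapping does not add capabilities the base lacks");
}    // namespace sketch
```

Because the specialization inherits rather than hard-coding true or false, the wrapper stays correct if it is later instantiated with a base executor that has different capabilities.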

In main, we create the executor with a base executor and an annotation string, and then use it with a parallel algorithm:

int main()
{
    using base_executor = hpx::execution::parallel_executor;
    simple_annotating_executor exec(base_executor{}, "for_each_task");

    std::vector<int> data = {1, 2, 3, 4, 5};

    // Use the custom executor with a parallel algorithm
    hpx::for_each(hpx::execution::par.on(exec),    // attach executor
        data.begin(), data.end(), [](int& x) {
            std::cout << "Processing " << x << " on thread "
                    << hpx::get_worker_thread_num() << "\n";
            x *= x;
        });

    std::cout << "Squared values: ";
    for (int v : data)
        std::cout << v << " ";
    std::cout << "\n";

    return 0;
}

First, we create a base executor (parallel_executor) and wrap it in our simple_annotating_executor, providing an annotation string “for_each_task”. This custom executor will attach the annotation to every task it schedules, while delegating actual execution to the base executor.

We then use hpx::for_each with a parallel execution policy and attach our custom executor using par.on(exec):

  • hpx::execution::par.on(exec) attaches our custom executor to the algorithm.

  • for_each internally partitions the work across threads and schedules each task using bulk_async_execute.

  • Each task is annotated with “for_each_task”, visible in debuggers and profilers.

  • The results of the parallel computation are stored in the data vector, demonstrating that the algorithm executed successfully in parallel.

This pattern is especially useful in larger applications with many tasks, as annotations make it much easier to trace and debug the execution of parallel algorithms.

Miscellaneous#

Error handling#

Like in any other asynchronous invocation scheme, it is important to be able to handle error conditions occurring while the asynchronous (and possibly remote) operation is executed. In HPX all error handling is based on standard C++ exception handling. Any exception thrown during the execution of an asynchronous operation will be transferred back to the original invocation locality, where it will be rethrown during synchronization with the calling thread.

The source code for this example can be found here: error_handling.cpp.

Working with exceptions#

For the following description assume that the function raise_exception() is executed by invoking the plain action raise_exception_action.

void raise_exception()
{
    HPX_THROW_EXCEPTION(
        hpx::error::no_success, "raise_exception", "simulated error");
}
HPX_PLAIN_ACTION(raise_exception, raise_exception_action)

The exception is thrown using the macro HPX_THROW_EXCEPTION. The type of the thrown exception is hpx::exception. The macro associates additional diagnostic information with the exception, such as file name and line number, locality id and thread id, and stack backtrace from the point where the exception was thrown.

Any exception thrown during the execution of an action is transferred back to the (asynchronous) invocation site. It will be rethrown in this context when the calling thread tries to wait for the result of the action by invoking either future<>::get() or the synchronous action invocation wrapper.

Note

The exception is transferred back to the invocation site even if it is executed on a different locality.
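The transfer-and-rethrow behavior can be sketched in standard C++ without HPX or remote localities. The class stored_result and the helper functions below are hypothetical and only illustrate the mechanism: the exception is captured where the task runs, stored, and rethrown when the caller asks for the result. HPX futures additionally carry the exception across threads and localities, which this single-threaded sketch does not attempt.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>

namespace sketch {

    // Holds either the value a task produced or the exception it threw.
    // T is assumed to be default constructible in this sketch.
    template <typename T>
    class stored_result
    {
    public:
        // Run the task now; capture any exception instead of letting it escape.
        template <typename F>
        explicit stored_result(F&& f)
        {
            try
            {
                value_ = std::forward<F>(f)();
            }
            catch (...)
            {
                error_ = std::current_exception();
            }
        }

        bool has_exception() const
        {
            return static_cast<bool>(error_);
        }

        // Synchronization point: rethrow the stored exception, if any.
        T get() const
        {
            if (error_)
                std::rethrow_exception(error_);
            return value_;
        }

    private:
        T value_{};
        std::exception_ptr error_;
    };

    inline int raise_error()
    {
        throw std::runtime_error("simulated error");
    }

    // Returns the message the caller observes at the synchronization point.
    inline std::string message_seen_by_caller()
    {
        stored_result<int> result(&raise_error);
        assert(result.has_exception());    // stored, but not thrown yet
        try
        {
            result.get();    // the captured exception is rethrown here
        }
        catch (std::runtime_error const& e)
        {
            return e.what();
        }
        return "no error";
    }
}    // namespace sketch
```

The caller only needs a try block around get(); constructing the result never throws, which matches the idea that the error surfaces during synchronization rather than at launch.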

Additionally, this example demonstrates how an exception thrown by a (possibly remote) action can be handled. It shows the use of hpx::diagnostic_information, which retrieves all available diagnostic information from the exception as a formatted string. This includes, for instance, the name of the source file and line number, the sequence number of the OS thread and the HPX thread id, the locality id and the stack backtrace of the point where the original exception was thrown.

Under certain circumstances it is desirable to output only some of the diagnostics, or to output those using different formatting. For this case, HPX exposes a set of lower-level functions, such as hpx::get_error_what and hpx::get_error_locality_id, each of which extracts a single element of the diagnostic information.

Working with error codes#

Most of the API functions exposed by HPX can be invoked in two different modes. By default those will throw an exception on error as described above. However, sometimes it is desirable not to throw an exception in case of an error condition. In this case an object instance of the hpx::error_code type can be passed as the last argument to the API function. In case of an error, the error condition will be returned in that hpx::error_code instance. The following example demonstrates extracting the full diagnostic information without exception handling:

            hpx::cout << "Error reporting using error code\n";

            // Create a new error_code instance.
            hpx::error_code ec;

            // If an instance of an error_code is passed as the last argument while
            // invoking the action, the function will not throw in case of an error
            // but store the error information in this error_code instance instead.
            raise_exception_action do_it;
            do_it(hpx::find_here(), ec);

            if (ec)
            {
                // Print just the essential error information.
                hpx::cout << "returned error: " << ec.get_message() << "\n";

                // Print all of the available diagnostic information as stored with
                // the exception.
                hpx::cout << "diagnostic information:"
                          << hpx::diagnostic_information(ec) << "\n";
            }

            hpx::cout << std::flush;

Note

The error information is transferred back to the invocation site even if it is executed on a different locality.

This example shows how an error can be handled without having to resort to exceptions, and that the returned hpx::error_code instance can be used in much the same way as the hpx::exception type above. Simply pass it to hpx::diagnostic_information, which retrieves all available diagnostic information from the error code instance as a formatted string.
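The two invocation modes follow a pattern that also appears in the C++ standard library. The sketch below uses std::error_code as a stand-in for hpx::error_code; checked_divide and demo_error_modes are hypothetical functions written for illustration and are not part of HPX. One overload throws on error, the other takes an error code as its last argument and reports the failure there instead.

```cpp
#include <cassert>
#include <system_error>

namespace sketch {

    // Non-throwing mode: the caller passes an error_code as the last
    // argument and inspects it afterwards.
    inline int checked_divide(int a, int b, std::error_code& ec) noexcept
    {
        if (b == 0)
        {
            ec = std::make_error_code(std::errc::invalid_argument);
            return 0;
        }
        ec.clear();
        return a / b;
    }

    // Throwing mode (the default): same operation, errors become exceptions.
    inline int checked_divide(int a, int b)
    {
        std::error_code ec;
        int const result = checked_divide(a, b, ec);
        if (ec)
            throw std::system_error(ec, "checked_divide");
        return result;
    }

    // Usage: both modes report the same error condition.
    inline void demo_error_modes()
    {
        std::error_code ec;
        assert(checked_divide(6, 3, ec) == 2 && !ec);

        checked_divide(1, 0, ec);    // does not throw, sets ec instead
        assert(ec == std::errc::invalid_argument);

        bool threw = false;
        try
        {
            checked_divide(1, 0);    // throws std::system_error
        }
        catch (std::system_error const& e)
        {
            threw = (e.code() == ec);
        }
        assert(threw);
    }
}    // namespace sketch
```

Implementing the throwing overload on top of the non-throwing one keeps the error logic in a single place, so the two modes cannot drift apart.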

As with exceptions, when working with error codes it is sometimes desirable to output only some of the diagnostics, or to format them differently. For this case, HPX exposes a set of lower-level functions usable with error codes as demonstrated in the following code snippet:

            hpx::cout << "Detailed error reporting using error code\n";

            // Create a new error_code instance.
            hpx::error_code ec;

            // If an instance of an error_code is passed as the last argument while
            // invoking the action, the function will not throw in case of an error
            // but store the error information in this error_code instance instead.
            raise_exception_action do_it;
            do_it(hpx::find_here(), ec);

            if (ec)
            {
                // Print the elements of the diagnostic information separately.
                hpx::cout << "{what}: " << hpx::get_error_what(ec) << "\n";
                hpx::cout << "{locality-id}: " << hpx::get_error_locality_id(ec)
                          << "\n";
                hpx::cout << "{hostname}: " << hpx::get_error_host_name(ec)
                          << "\n";
                hpx::cout << "{pid}: " << hpx::get_error_process_id(ec) << "\n";
                hpx::cout << "{function}: " << hpx::get_error_function_name(ec)
                          << "\n";
                hpx::cout << "{file}: " << hpx::get_error_file_name(ec) << "\n";
                hpx::cout << "{line}: " << hpx::get_error_line_number(ec)
                          << "\n";
                hpx::cout << "{os-thread}: " << hpx::get_error_os_thread(ec)
                          << "\n";
                hpx::cout << "{thread-id}: " << std::hex
                          << hpx::get_error_thread_id(ec) << "\n";
                hpx::cout << "{thread-description}: "
                          << hpx::get_error_thread_description(ec) << "\n\n";
                hpx::cout << "{state}: " << std::hex << hpx::get_error_state(ec)
                          << "\n";
                hpx::cout << "{stack-trace}: " << hpx::get_error_backtrace(ec)
                          << "\n";
                hpx::cout << "{env}: " << hpx::get_error_env(ec) << "\n";
            }

            hpx::cout << std::flush;

For more information please refer to the documentation of hpx::get_error_what, hpx::get_error_locality_id, hpx::get_error_host_name, hpx::get_error_process_id, hpx::get_error_function_name, hpx::get_error_file_name, hpx::get_error_line_number, hpx::get_error_os_thread, hpx::get_error_thread_id, hpx::get_error_thread_description, hpx::get_error_backtrace, hpx::get_error_env, and hpx::get_error_state.

Lightweight error codes#

Sometimes it is not desirable to collect all the ambient information about the error at the point where it happened as this might impose too much overhead for simple scenarios. In this case, HPX provides a lightweight error code facility that will hold the error code only. The following snippet demonstrates its use:

            hpx::cout << "Error reporting using a lightweight error code\n";

            // Create a new error_code instance.
            hpx::error_code ec(hpx::throwmode::lightweight);

            // If an instance of an error_code is passed as the last argument while
            // invoking the action, the function will not throw in case of an error
            // but store the error information in this error_code instance instead.
            raise_exception_action do_it;
            do_it(hpx::find_here(), ec);

            if (ec)
            {
                // Print just the essential error information.
                hpx::cout << "returned error: " << ec.get_message() << "\n";

                // Print only the numeric error code; a lightweight error_code
                // carries no further diagnostic information.
                hpx::cout << "error code:" << ec.value() << "\n";
            }

            hpx::cout << std::flush;

All functions that retrieve other diagnostic elements from the hpx::error_code will fail if called with a lightweight error_code instance.

Utilities in HPX#

In order to ease the burden of programming, HPX provides several utilities to users. The following sections document those facilities.

Checkpoint#

See checkpoint.

The HPX I/O-streams component#

The HPX I/O-streams subsystem extends the standard C++ output streams std::cout and std::cerr to work in the distributed setting of an HPX application. All of the output streamed to hpx::cout will be dispatched to std::cout on the console locality. Likewise, all output generated from hpx::cerr will be dispatched to std::cerr on the console locality.

Note

All existing standard manipulators can be used in conjunction with hpx::cout and hpx::cerr.

In order to use either hpx::cout or hpx::cerr, application code needs to #include <hpx/iostream.hpp>. For an example, please see the following ‘Hello world’ program:

// Including 'hpx/hpx_main.hpp' instead of the usual 'hpx/hpx_init.hpp' lets
// us use the plain C main below as the direct HPX entry point.
#include <hpx/hpx_main.hpp>
#include <hpx/iostream.hpp>

int main()
{
    // Say hello to the world!
    hpx::cout << "Hello World!\n" << std::flush;
    return 0;
}

Additionally, those applications need to link with the iostreams component. When using CMake this can be achieved by using the COMPONENT_DEPENDENCIES parameter; for instance:

include(HPX_AddExecutable)

add_hpx_executable(
    hello_world
    SOURCES hello_world.cpp
    COMPONENT_DEPENDENCIES iostreams
)

Note

The hpx::cout and hpx::cerr streams buffer all output locally until a std::endl or std::flush is encountered. That means that no output will appear on the console until one of these is explicitly used.

Troubleshooting#

Common issues#

This section contains commonly encountered problems when compiling or using HPX.

See also the closed issues on GitHub to find out how other people resolved a similar problem. If none of that helps, you can also open a new issue on GitHub or contact us using one of the options found in Support for deploying and using HPX.

"HPX::iostreams_component" target not found#

You may see a CMake error message that looks a bit like this:

error: `HPX::iostreams_component` target not found

To avoid this error, make sure that HPX is installed with HPX_WITH_DISTRIBUTED_RUNTIME=ON. This is required if you want to use hpx::cout.

Undefined reference to hpx::cout#

You may see a linker error message that looks a bit like this:

hello_world.cpp:(.text+0x5aa): undefined reference to `hpx::cout'

This usually happens if you are trying to use HPX iostreams functionality such as hpx::cout but are not linking against it. The iostreams functionality is not part of the core HPX library, and must be linked to explicitly. Typically this can be solved by adding COMPONENT_DEPENDENCIES iostreams to a call to add_hpx_library/add_hpx_executable/hpx_setup_target if using CMake. See Creating HPX projects for more details.

Build fails with ASIO error#

You may see an error message that looks a bit like this:

Cannot open include file asio/io_context.hpp

This can be resolved by adding -DHPX_WITH_FETCH_ASIO=ON to the cmake command line.

See also the corresponding closed Issue #5404 for more information.

Build fails with TCMalloc error#

You may see an error message that looks a bit like this:

Could NOT find TCMalloc (missing: Tcmalloc_LIBRARY Tcmalloc_INCLUDE_DIR)
ERROR: HPX_WITH_MALLOC was set to tcmalloc, but tcmalloc could not be
found.  Valid options for HPX_WITH_MALLOC are: system, tcmalloc, jemalloc,
mimalloc, tbbmalloc, and custom

This can be resolved either by setting HPX_WITH_MALLOC=system or by installing TCMalloc. This error occurs when users don’t specify an option for HPX_WITH_MALLOC; in that case, HPX looks for tcmalloc, which is the default value.

Useful suggestions#
Reducing compilation time#

If you want to significantly reduce compilation time, you can just use the local part of HPX for parallelism by disabling the distributed functionality. Moreover, you can avoid compiling examples and tests. Both can be done with the following flags:

-DHPX_WITH_NETWORKING=OFF
-DHPX_WITH_DISTRIBUTED_RUNTIME=OFF
-DHPX_WITH_EXAMPLES=OFF
-DHPX_WITH_TESTS=OFF
Linking HPX to your application#

If you want to use HPX without installing it, you can just build HPX and then pass the following flag to the CMake configuration of your HPX application:

-DHPX_DIR=<build_dir>/lib/cmake/HPX

Note

For this to work, you must not specify -DCMAKE_INSTALL_PREFIX when building HPX.

HPX-application build type conformance#

Your application’s build type should align with the HPX build type. For example, if you specified -DCMAKE_BUILD_TYPE=Debug during the HPX compilation, then your application needs to be compiled with the same flag. We recommend keeping a separate build folder for each build type and pointing to the one you want by using -DHPX_DIR=<build_dir>/lib/cmake/HPX.

Terminology#

This section gives definitions for some of the terms used throughout the HPX documentation and source code.

Locality#

A locality in HPX describes a synchronous domain of execution, or the domain of bounded upper response time. This normally is just a single node in a cluster or a NUMA domain in an SMP machine.

Active Global Address Space#
AGAS#

HPX incorporates a global address space. Any executing thread can access any object within the domain of the parallel application with the caveat that it must have appropriate access privileges. The model does not assume that global addresses are cache coherent; all loads and stores will deal directly with the site of the target object. All global addresses within a Synchronous Domain are assumed to be cache coherent for those processor cores that incorporate transparent caches. The Active Global Address Space used by HPX differs from research PGAS (Partitioned Global Address Space) models, which are passive in their means of address translation. Copy semantics, distributed compound operations, and affinity relationships are some of the global functionality supported by AGAS.

Process#

The concept of the “process” in HPX is extended beyond that of either sequential execution or communicating sequential processes. While the notion of process suggests action (as do “function” or “subroutine”), it has a further responsibility of context, that is, the logical container of program state. It is in this sense that the term process is used in HPX. Furthermore, referring to “parallel processes” in HPX designates the presence of parallelism within the context of a given process, as well as the coarse grained parallelism achieved through concurrency of multiple processes of an executing user job. HPX processes provide a hierarchical name space within the framework of the active global address space and support multiple means of internal state access from external sources.

Parcel#

The Parcel is a component in HPX that communicates data, invokes an action at a distance, and distributes flow-control through the migration of continuations. Parcels bridge the gap of asynchrony between synchronous domains while maintaining symmetry of semantics between local and global execution. Parcels enable message-driven computation and may be seen as a form of “active messages”. Other important forms of message-driven computation predating active messages include dataflow tokens, the J-machine’s support for remote method instantiation, and, at the coarse-grained end, variations of Unix remote procedure calls, among others. Parcels enable work to be moved to the data as well as the more common action of bringing data to the work. A parcel can cause actions to occur remotely and asynchronously, among which are the creation of threads at different system nodes or synchronous domains.

Local Control Object#
Lightweight Control Object#
LCO#

A local control object (sometimes called a lightweight control object) is a general term for the synchronization mechanisms used in HPX. Any object implementing a certain concept can be seen as an LCO. This concept encapsulates the ability to be triggered by one or more events which, when they take the object into a predefined state, cause a thread to be executed. This could either create a new thread or resume an existing thread.

The LCO is a family of synchronization functions potentially representing many classes of synchronization constructs, each with many possible variations and multiple instances. The LCO is sufficiently general that it can subsume the functionality of conventional synchronization primitives such as spinlocks, mutexes, semaphores, and global barriers. However, because the concept is so rich, an LCO can also represent powerful synchronization and control functionality not widely employed, such as dataflow and futures (among others), which open up enormous opportunities for a rich diversity of distributed control and operation.
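The triggering behavior that defines an LCO can be made concrete with a deliberately small sketch in standard C++. The class counting_trigger is hypothetical and is not an HPX type: it is single-threaded, does no locking, and runs a plain callback where HPX would create or resume a thread. It only shows the core idea that events move the object toward a predefined state, and that reaching this state causes the continuation to run exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

namespace sketch {

    // Runs a continuation once a fixed number of events has been signaled.
    class counting_trigger
    {
    public:
        counting_trigger(std::size_t expected, std::function<void()> on_ready)
          : remaining_(expected)
          , on_ready_(std::move(on_ready))
        {
            assert(expected > 0);    // at least one event must be awaited
        }

        // Signal one event. The continuation runs when the last expected
        // event arrives; later calls are ignored.
        void set()
        {
            if (remaining_ == 0)
                return;
            if (--remaining_ == 0)
                on_ready_();
        }

        // True once the predefined state (all events seen) has been reached.
        bool ready() const
        {
            return remaining_ == 0;
        }

    private:
        std::size_t remaining_;
        std::function<void()> on_ready_;
    };
}    // namespace sketch
```

With an expected count of two, the first call to set() changes nothing visible, the second one runs the continuation, and any further call has no effect.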

See lcos for more details on how to use LCOs in HPX.

Action#

An action is a function that can be invoked remotely. In HPX a plain function can be made into an action using a macro. See applying_actions for details on how to use actions in HPX.

Component#

A component is a C++ object which can be accessed remotely. A component can also contain member functions which can be invoked remotely. These are referred to as component actions. See Writing components for details on how to use components in HPX.

Why HPX?#

Current advances in high performance computing (HPC) continue to suffer from the issues plaguing parallel computation. These issues include, but are not limited to, ease of programming, inability to handle dynamically changing workloads, scalability, and efficient utilization of system resources. Emerging technological trends such as multi-core processors further highlight limitations of existing parallel computation models. To mitigate the aforementioned problems, it is necessary to rethink the approach to parallelization models. ParalleX contains mechanisms such as multi-threading, parcels, global name space support, percolation and local control objects (LCO). By design, ParalleX overcomes limitations of current models of parallelism by alleviating contention, latency, overhead and starvation. With ParalleX, it is further possible to increase performance by at least an order of magnitude on challenging parallel algorithms, e.g., dynamic directed graph algorithms and adaptive mesh refinement methods for astrophysics. An additional benefit of ParalleX is fine-grained control of power usage, enabling reductions in power consumption.

ParalleX—a new execution model for future architectures#

ParalleX is a new parallel execution model that offers an alternative to the conventional computation models, such as message passing. ParalleX distinguishes itself by:

  • Split-phase transaction model

  • Message-driven

  • Distributed shared memory (not cache coherent)

  • Multi-threaded

  • Futures synchronization

  • Local Control Objects (LCOs)

  • Synchronization for anonymous producer-consumer scenarios

  • Percolation (pre-staging of task data)

The ParalleX model is intrinsically latency hiding, delivering an abundance of variable-grained parallelism within a hierarchical namespace environment. The goal of this innovative strategy is to enable future systems delivering very high efficiency, increased scalability and ease of programming. ParalleX can contribute to significant improvements in the design of all levels of computing systems and their usage from application algorithms and their programming languages to system architecture and hardware design together with their supporting compilers and operating system software.

What is HPX?#

High Performance ParalleX (HPX) is the first runtime system implementation of the ParalleX execution model. The HPX runtime software package is a modular, feature-complete, and performance-oriented representation of the ParalleX execution model targeted at conventional parallel computing architectures, such as SMP nodes and commodity clusters. It is academically developed and freely available under an open source license. We provide HPX to the community for experimentation and application to achieve high efficiency and scalability for dynamic adaptive and irregular computational problems. HPX is a C++ library that supports a set of critical mechanisms for dynamic adaptive resource management and lightweight task scheduling within the context of a global address space. It is solidly based on many years of experience in writing highly parallel applications for HPC systems.

The two-decade success of the communicating sequential processes (CSP) execution model and its message passing interface (MPI) programming model has been seriously eroded by challenges of power, processor core complexity, multi-core sockets, and heterogeneous structures of GPUs. Both efficiency and scalability for some current (strong scaled) applications and future Exascale applications demand new techniques to expose new sources of algorithm parallelism and exploit unused resources through adaptive use of runtime information.

The ParalleX execution model replaces CSP to provide a new computing paradigm embodying the governing principles for organizing and conducting highly efficient scalable computations greatly exceeding the capabilities of today’s problems. HPX is the first practical, reliable, and performance-oriented runtime system incorporating the principal concepts of the ParalleX model publicly provided in open source release form.

HPX is designed by the STE||AR Group (Systems Technology, Emergent Parallelism, and Algorithm Research) at Louisiana State University (LSU)’s Center for Computation and Technology (CCT) to enable developers to exploit the full processing power of many-core systems with an unprecedented degree of parallelism. STE||AR is a research group focusing on system software solutions and scientific application development for hybrid and many-core hardware architectures.

For more information about the STE||AR Group, see People.

What makes our systems slow?#

Estimates say that we currently run our computers at well below 100% efficiency. The theoretical peak performance (usually measured in FLOPS, floating point operations per second) is much higher than any practical peak performance reached by any application. This is particularly true for highly parallel hardware. The more hardware parallelism we provide to an application, the better the application must scale in order to efficiently use all the resources of the machine. Roughly speaking, we distinguish two forms of scalability: strong scaling (see Amdahl’s Law) and weak scaling (see Gustafson’s Law). Strong scaling is defined as how the solution time varies with the number of processors for a fixed total problem size. It gives an estimate of how much faster we can solve a particular problem by throwing more resources at it. Weak scaling is defined as how the solution time varies with the number of processors for a fixed problem size per processor. In other words, it defines how much more data we can process by using more hardware resources.
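The two laws named above can be written down in a few lines. The following sketch uses the textbook formulas, with p the fraction of work that runs in parallel and n the number of processors; the function names are hypothetical and the code is an illustration of the arithmetic, not a performance model of HPX.

```cpp
#include <cassert>

namespace sketch {

    // Amdahl's Law (strong scaling, fixed total problem size):
    // speedup = 1 / ((1 - p) + p / n)
    inline double amdahl_speedup(double parallel_fraction, double processors)
    {
        assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
        assert(processors >= 1.0);
        return 1.0 /
            ((1.0 - parallel_fraction) + parallel_fraction / processors);
    }

    // Gustafson's Law (weak scaling, fixed problem size per processor):
    // scaled speedup = (1 - p) + p * n
    inline double gustafson_speedup(
        double parallel_fraction, double processors)
    {
        assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
        assert(processors >= 1.0);
        return (1.0 - parallel_fraction) + parallel_fraction * processors;
    }
}    // namespace sketch
```

For example, with 95% parallel work on 16 processors the sketch gives a strong-scaling speedup of about 9.14 and a scaled (weak-scaling) speedup of 15.25. Under Amdahl’s Law the speedup can never exceed 1 / (1 - 0.95) = 20, however many processors are added, which is why the remaining serial fraction matters so much.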

In order to utilize as much hardware parallelism as possible an application must exhibit excellent strong and weak scaling characteristics, which requires a high percentage of work executed in parallel, i.e., using multiple threads of execution. Optimally, if you execute an application on a hardware resource with N processors it either runs N times faster or it can handle N times more data. Both cases imply 100% of the work is executed on all available processors in parallel. However, this is just a theoretical limit. Unfortunately, there are more things that limit scalability, mostly inherent to the hardware architectures and the programming models we use. We break these limitations into four fundamental factors that make our systems SLOW:

  • Starvation occurs when there is insufficient concurrent work available to maintain high utilization of all resources.

  • Latencies are imposed by the time-distance delay intrinsic to accessing remote resources and services.

  • Overhead is work required for the management of parallel actions and resources on the critical execution path, which is not necessary in a sequential variant.

  • Waiting for contention resolution is the delay due to the lack of availability of oversubscribed shared resources.

Each of those four factors manifests itself in multiple and different ways; each of the hardware architectures and programming models exposes specific forms. However, the interesting part is that all of them are limiting the scalability of applications no matter what part of the hardware jungle we look at. Hand-helds, PCs, supercomputers, or the cloud, all suffer from the reign of the 4 horsemen: Starvation, Latency, Overhead, and Contention. This realization is very important as it allows us to derive the criteria for solutions to the scalability problem from first principles, and it allows us to focus our analysis on very concrete patterns and measurable metrics. Moreover, any derived results will be applicable to a wide variety of targets.

Technology demands new response#

Today’s computer systems are designed based on the initial ideas of John von Neumann, as published back in 1945, and later extended by the Harvard architecture. These ideas form the foundation, the execution model, of computer systems we use currently. However, a new response is required in the light of the demands created by today’s technology.

So, what are the overarching objectives for designing systems allowing for applications to scale as they should? In our opinion, the main objectives are:

  • Performance: as previously mentioned, scalability and efficiency are the main criteria people are interested in.

  • Fault tolerance: the low expected mean time between failures (MTBF) of future systems requires embracing faults, not trying to avoid them.

  • Power: minimizing energy consumption is a must as it is one of the major cost factors today, and will continue to rise in the future.

  • Generality: any system should be usable for a broad set of use cases.

  • Programmability: for programmers, this is a very important objective, ensuring long-term platform stability and portability.

What needs to be done to meet those objectives, to make applications scale better on tomorrow’s architectures? Well, the answer is almost obvious: we need to devise a new execution model (a set of governing principles for the holistic design of future systems) targeted at minimizing the effect of the outlined SLOW factors. Everything we create for future systems, every design decision we make, every criterion we apply, has to be validated against this single, uniform metric. This includes changes in the hardware architecture we prevalently use today, and it certainly involves new ways of writing software, starting from the operating system, runtime system, compilers, and at the application level. However, the key point is that all those layers have to be co-designed; they are interdependent and cannot be seen as separate facets. The systems we have today have been evolving for over 50 years now. All layers function in a certain way, relying on the other layers to do so. But we do not have the time to wait another 50 years for a new coherent system to evolve. The new paradigms are needed now; therefore, co-design is the key.

Governing principles applied while developing HPX#

As it turns out, we do not have to start from scratch. Not everything has to be invented and designed anew. Many of the ideas needed to combat the 4 horsemen already exist, many for more than 30 years. All it takes is to gather them into a coherent approach. We’ll highlight some of the derived principles we think to be crucial for defeating SLOW. Some of those are focused on high-performance computing, others are more general.

Focus on latency hiding instead of latency avoidance#

It is impossible to design a system exposing zero latencies. In an effort to come as close as possible to this goal many optimizations are mainly targeted towards minimizing latencies. Examples of this can be seen everywhere, such as low latency network technologies like InfiniBand, caching memory hierarchies in all modern processors, the constant optimization of existing MPI implementations to reduce related latencies, or the data transfer latencies intrinsic to the way we use GPGPUs today. It is important to note that existing latencies are often tightly related to some resource having to wait for the operation to be completed. At the same time it would be perfectly fine to do some other, unrelated work in the meantime, allowing the system to hide the latencies by filling the idle time with useful work. Modern systems already employ similar techniques (pipelined instruction execution in the processor cores, asynchronous input/output operations, and many more). What we propose is to go beyond anything we know today and to make latency hiding an intrinsic concept of the operation of the whole system stack.
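
The idea of filling idle time with useful work can be illustrated with plain standard C++ futures. The following is a minimal sketch, not HPX code; slow_fetch is a hypothetical stand-in for any high-latency operation, and the 50 ms delay is an arbitrary assumption.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for a high-latency operation (e.g. a remote fetch).
int slow_fetch()
{
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return 42;
}

// Starts the slow operation, does unrelated local work while it is in
// flight, and waits only at the point where the remote value is needed.
int overlap_fetch_with_local_work(std::vector<int> const& local_data)
{
    std::future<int> remote = std::async(std::launch::async, slow_fetch);
    assert(remote.valid());

    // Useful work that does not depend on the remote value.
    int const local_sum =
        std::accumulate(local_data.begin(), local_data.end(), 0);

    // Block only here, after the independent work is done.
    return local_sum + remote.get();
}
```

Calling overlap_fetch_with_local_work with a vector of local values returns their sum plus the fetched value; the summation runs while the fetch is still pending, so its cost is hidden behind the latency of the fetch.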

Embrace fine-grained parallelism instead of heavyweight threads#

If we plan to hide latencies even for very short operations, such as fetching the contents of a memory cell from main memory (if it is not already cached), we need to have very lightweight threads with extremely short context switching times, optimally executable within one cycle. Granted, for mainstream architectures, this is not possible today (even if we already have special machines supporting this mode of operation, such as the Cray XMT). For conventional systems, however, the smaller the overhead of a context switch and the finer the granularity of the threading system, the better will be the overall system utilization and its efficiency. For today’s architectures we already see a flurry of libraries providing exactly this type of functionality: non-pre-emptive, task-queue based parallelization solutions, such as Intel Threading Building Blocks (TBB), Microsoft Parallel Patterns Library (PPL), Cilk++, and many others. The ability to suspend the current task if some preconditions for its execution are not met (such as waiting for I/O or the result of a different task), to switch seamlessly to any other task which can continue, and to reschedule the initial task after the required result has been calculated makes the implementation of latency hiding almost trivial.

Rediscover constraint-based synchronization to replace global barriers#

The code we write today is riddled with implicit (and explicit) global barriers. By “global barriers,” we mean the synchronization of the control flow between several (very often all) threads (when using OpenMP) or processes (MPI). For instance, an implicit global barrier is inserted after each loop parallelized using OpenMP as the system synchronizes the threads used to execute the different iterations in parallel. In MPI each of the communication steps imposes an explicit barrier onto the execution flow as (often all) nodes have to be synchronized. Each of those barriers is like the eye of a needle the overall execution is forced to be squeezed through. Even minimal fluctuations in the execution times of the parallel threads (jobs) cause them to wait. Additionally, it is often only one of the executing threads that performs the actual reduce operation, which further impedes parallelism. A closer analysis of a couple of key algorithms used in science applications reveals that these global barriers are not always necessary. In many cases it is sufficient to synchronize a small subset of the threads. Any operation should proceed whenever the preconditions for its execution are met, and only those. Usually there is no need to wait for iterations of a loop to finish before you can continue calculating other things; all you need is to complete the iterations that produce the required results for the next operation. Goodbye global barriers, hello constraint-based synchronization! People have been trying to build this type of computing (and even computers) since the 1970s. The theory behind what they did is based on ideas around static and dynamic dataflow. There are certain attempts today to get back to those ideas and to incorporate them with modern architectures. For instance, a lot of work is being done in the area of constructing dataflow-oriented execution trees. Our results show that employing dataflow techniques in combination with the other ideas, as outlined herein, considerably improves scalability for many problems.
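
The difference between a global barrier and constraint-based synchronization can be sketched with standard C++ futures. In the following illustration (not HPX code; all names are our own), every consumer waits only for the inputs it actually needs, and nothing forces all tasks to meet at a common point.

```cpp
#include <cassert>
#include <future>

// Builds a small dependency graph: d needs a and b, e needs b and c.
// Neither d nor e waits for a producer it does not depend on.
int sum_of_dependent_results()
{
    // Three independent producers (stand-ins for loop iterations or tasks).
    std::shared_future<int> a =
        std::async(std::launch::async, [] { return 1; }).share();
    std::shared_future<int> b =
        std::async(std::launch::async, [] { return 2; }).share();
    std::shared_future<int> c =
        std::async(std::launch::async, [] { return 3; }).share();

    // d depends on a and b only, so it can run before c has finished.
    std::future<int> d = std::async(
        std::launch::async, [a, b] { return a.get() + b.get(); });

    // e depends on b and c only.
    std::future<int> e = std::async(
        std::launch::async, [b, c] { return b.get() * c.get(); });

    int const d_value = d.get();
    int const e_value = e.get();
    assert(d_value == 3 && e_value == 6);
    return d_value + e_value;
}
```

The sketch blocks a thread inside each consumer while it waits, which a dataflow-oriented runtime would avoid by scheduling the consumer only once its inputs are ready; the dependency structure, however, is the same.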

Adaptive locality control instead of static data distribution#

While this principle seems to be a given for single desktop or laptop computers (the operating system is your friend), it is anything but ubiquitous on modern supercomputers, which are usually built from a large number of separate nodes (e.g., Beowulf clusters), tightly interconnected by a high-bandwidth, low-latency network. Today’s prevalent programming model for those is MPI, which does not directly help with proper data distribution, leaving it to the programmer to decompose the data to all of the nodes the application is running on. There are a couple of specialized languages and programming environments based on PGAS (Partitioned Global Address Space) designed to overcome this limitation, such as Chapel, X10, UPC, or Fortress. However, all systems based on PGAS rely on static data distribution. This works fine as long as this static data distribution does not result in heterogeneous workload distributions or other resource utilization imbalances. In a distributed system these imbalances can be mitigated by migrating part of the application data to different localities (nodes). The only framework supporting (limited) migration today is Charm++. The first attempts towards solving related problems go back decades as well; a good example is the Linda coordination language. Nevertheless, none of the other mentioned systems support data migration today, which forces the users to either rely on static data distribution and live with the related performance hits or to implement everything themselves, which is very tedious and difficult. We believe that the only viable way to flexibly support dynamic and adaptive locality control is to provide a global, uniform address space to the applications, even on distributed systems.

Prefer moving work to the data over moving data to the work#

For the best performance it seems obvious to minimize the number of bytes transferred from one part of the system to another. This is true on all levels. At the lowest level we try to take advantage of processor memory caches, thus minimizing memory latencies. Similarly, we try to amortize the data transfer time to and from GPGPUs as much as possible. At high levels we try to minimize data transfer between different nodes of a cluster or between different virtual machines on the cloud. Our experience (well, it’s almost common wisdom) shows that the number of bytes necessary to encode a certain operation is very often much smaller than the number of bytes encoding the data the operation is performed upon. Nevertheless, we still often transfer the data to a particular place where we execute the operation just to bring the data back to where it came from afterwards. As an example let’s look at the way we usually write our applications for clusters using MPI. This programming model is all about data transfer between nodes. MPI is the prevalent programming model for clusters, and it is fairly straightforward to understand and to use. Therefore, we often write applications in a way that accommodates this model, centered around data transfer. These applications usually work well for smaller problem sizes and for regular data structures. The larger the amount of data we have to churn and the more irregular the problem domain becomes, the worse the overall machine utilization and the (strong) scaling characteristics become. While it is not impossible to implement more dynamic, data driven, and asynchronous applications using MPI, it is somewhat difficult to do so. At the same time, if we look at applications that prefer to execute the code close to the locality where the data was placed, i.e., utilizing active messages (for instance based on Charm++), we see better asynchrony, simpler application codes, and improved scaling.

Favor message driven computation over message passing#

Today’s prevalently used programming model on parallel (multi-node) systems is MPI. It is based on message passing, as the name implies, which means that the receiver has to be aware of a message about to come in. Both codes, the sender and the receiver, have to synchronize in order to perform the communication step. Even the newer, asynchronous interfaces require explicitly coding the algorithms around the required communication scheme. As a result, all but the most trivial MPI applications spend a considerable amount of time waiting for incoming messages, thus causing starvation and latencies to impede full resource utilization. The more complex and more dynamic the data structures and algorithms become, the larger the adverse effects. The community discovered message-driven and data-driven methods of implementing algorithms a long time ago, and systems such as Charm++ have already integrated active messages demonstrating the validity of the concept. Message-driven computation allows for sending messages without requiring the receiver to actively wait for them. Any incoming message is handled asynchronously and triggers the encoded action by passing along arguments and, possibly, continuations. HPX combines this scheme with work-queue based scheduling as described above, which allows the system to almost completely overlap any communication with useful work, thereby minimizing latencies.
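
The contrast with message passing can be illustrated by a toy, single-process model of message-driven execution. In the sketch below (standard C++ only; mailbox and message are hypothetical names and do not correspond to HPX types), a message carries the action to run together with its arguments, and the receiving side never posts a matching receive.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// A message carries the action to execute together with its arguments.
using message = std::function<void()>;

class mailbox
{
public:
    // Senders enqueue a message and return immediately.
    void send(message m) { pending_.push(std::move(m)); }

    // The scheduler drains the queue whenever it has spare cycles and
    // returns the number of messages it handled.
    int run_pending()
    {
        int handled = 0;
        while (!pending_.empty())
        {
            message m = std::move(pending_.front());
            pending_.pop();
            m();    // may enqueue follow-up messages (continuations)
            ++handled;
        }
        return handled;
    }

private:
    std::queue<message> pending_;
};

// Sends an "add" action whose handler sends a continuation that doubles
// the accumulated value.
int message_driven_demo()
{
    mailbox box;
    int value = 0;
    box.send([&box, &value] {
        value += 21;
        box.send([&value] { value *= 2; });
    });
    int const handled = box.run_pending();
    assert(handled == 2);
    return value;
}
```

A real runtime would deliver such messages across nodes and interleave their handlers with other scheduled work; the sketch only shows that the receiver's code is driven by the arrival of the message rather than by an explicit receive call.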

Additional material#

Overview#

HPX is organized into different sub-libraries and those in turn into modules. The libraries and modules are independent, with clear dependencies and no cycles. As an end-user, the use of these libraries is completely transparent. If you use e.g. add_hpx_executable to create a target in your project you will automatically get all modules as dependencies. See below for a list of the available libraries and modules. Currently these are nothing more than an internal grouping and do not affect usage. They cannot be consumed individually at the moment.

Note

There is a dependency report that displays useful information about the structure of the code. It is available for each commit at HPX Dependency report.

Core modules#

affinity#

The affinity module contains helper functionality for mapping worker threads to hardware resources.

See the API reference of the module for more details.

algorithms#

The algorithms module exposes the full set of algorithms defined by the C++ standard. There is also partial support for C++ ranges.

See the API reference of the module for more details.

allocator_support#

This module provides utilities for allocators. It contains hpx::util::internal_allocator which directly forwards allocation calls to jemalloc. This utility is mainly useful on Windows.

See the API reference of the module for more details.

asio#

The asio module is a thin wrapper around the Boost.Asio library, providing a few additional helper functions.

See the API reference of the module for more details.

assertion#

The assertion library implements the macros HPX_ASSERT and HPX_ASSERT_MSG. Those two macros can be used to implement assertions which are turned off during a release build.

By default, the location and function where the assert has been called from are displayed when the assertion fires. This behavior can be modified by using hpx::assertion::set_assertion_handler. When HPX initializes, it uses this function to specify a more elaborate assertion handler. If your application needs to customize this, it needs to do so before calling hpx::init, hpx_main or using the C-main wrappers.
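
The general pattern of an assertion macro with a replaceable handler can be sketched in standard C++ as follows. This is an illustration only: SKETCH_ASSERT, SKETCH_NO_ASSERT, and everything in the sketch namespace are hypothetical names, and the code makes no claim about how HPX_ASSERT or hpx::assertion::set_assertion_handler are implemented.

```cpp
#include <cassert>
#include <string>

namespace sketch {
    using handler_type = void (*)(char const* file, int line, char const* expr);

    // Stores a description of the most recent failed assertion.
    inline std::string& last_failure()
    {
        static std::string text;
        return text;
    }

    // Default handler: records the location and the failed expression.
    inline void recording_handler(char const* file, int line, char const* expr)
    {
        last_failure() =
            std::string(file) + ":" + std::to_string(line) + ": " + expr;
    }

    inline handler_type& handler()
    {
        static handler_type current = &recording_handler;
        return current;
    }

    // Installs a custom handler; do this before any assertion can fire.
    inline void set_assertion_handler(handler_type h)
    {
        assert(h != nullptr);
        handler() = h;
    }
}    // namespace sketch

// Expands to nothing when SKETCH_NO_ASSERT is defined ("release build").
#if defined(SKETCH_NO_ASSERT)
#define SKETCH_ASSERT(expr) static_cast<void>(0)
#else
#define SKETCH_ASSERT(expr) \
    ((expr) ? static_cast<void>(0) : sketch::handler()(__FILE__, __LINE__, #expr))
#endif
```

With the default handler installed, SKETCH_ASSERT(1 + 1 == 3) leaves a string containing the file, the line, and the text "1 + 1 == 3" in sketch::last_failure(), while a passing assertion leaves it untouched.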

See the API reference of the module for more details.

async_base#

The async_base module defines the basic functionality for spawning tasks on thread pools. This module does not implement any functionality on its own, but is extended by async_local and async_distributed with implementations for the local and distributed cases.

See the API reference of this module for more details.

async_combinators#

This module contains combinators for futures. The when_* functions allow you to turn multiple futures into a single future which is ready when all, any, some, or each of the given futures are ready. The wait_* combinators are equivalent to the when_* functions except that they do not return a future. Those wait for all futures to become ready before returning to the user. Note that the wait_* functions will rethrow one of the exceptions from exceptional futures. The wait_*_nothrow combinators are equivalent to the wait_* functions except that they do not throw if one of the futures has become exceptional.
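
To illustrate what a when_all-style combinator does, the following sketch builds a simplified version on top of std::future. It is not the HPX implementation: when_all_sketch and sum_when_all_ready are hypothetical names, and, as a simplification, the combined future yields the collected values directly.

```cpp
#include <cassert>
#include <future>
#include <utility>
#include <vector>

// Turns many futures into one future that yields all results.
template <typename T>
std::future<std::vector<T>> when_all_sketch(std::vector<std::future<T>> inputs)
{
    // Deferred launch: the waiting happens when the caller asks for the
    // combined result, so no additional thread is needed.
    return std::async(
        std::launch::deferred,
        [](std::vector<std::future<T>> futures) {
            std::vector<T> results;
            results.reserve(futures.size());
            for (auto& f : futures)
                results.push_back(f.get());    // rethrows a stored exception
            return results;
        },
        std::move(inputs));
}

// Combines two futures, fulfils them afterwards, and sums the results.
int sum_when_all_ready()
{
    std::promise<int> p1, p2;
    std::vector<std::future<int>> inputs;
    inputs.push_back(p1.get_future());
    inputs.push_back(p2.get_future());

    std::future<std::vector<int>> all = when_all_sketch(std::move(inputs));

    p1.set_value(40);
    p2.set_value(2);

    std::vector<int> const values = all.get();
    assert(values.size() == 2);
    return values[0] + values[1];
}
```

Because the sketch uses a deferred task, the combined future does the waiting only when get is called; a production combinator would instead become ready on its own as soon as the last input is ready.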

The split_future combinator takes a single future of a container (e.g. tuple) and turns it into a container of futures.

See lcos_local, synchronization, and async_distributed for other synchronization facilities.

See the API reference of this module for more details.

async_cuda#

This library adds a simple API that enables the user to retrieve a future from a CUDA stream. Typically, a user may launch one or more kernels and then get a future from the stream that will become ready when those kernels have completed. It is important to note that multiple kernels may be launched without fetching a future, and multiple futures may be obtained from the helper. Please refer to the unit tests and examples for further examples.

See the API reference of this module for more details.

async_local#

This module extends async_base to provide local implementations of hpx::async, hpx::post, hpx::sync, and hpx::dataflow. The async_distributed module extends the functionality in this module to work with actions.

See the API reference of this module for more details.

async_mpi#

The MPI library is intended to simplify the process of integrating MPI based codes with the HPX runtime. Any MPI function that is asynchronous and uses an MPI_Request may be converted into an hpx::future. The syntax is designed to allow a simple replacement of the MPI call with a futurized async version that accepts an executor instead of a communicator, and returns a future instead of assigning a request. Typically, an MPI call of the form

int MPI_Isend(buf, count, datatype, rank, tag, comm, request);

becomes

hpx::future<int> f = hpx::async(executor, MPI_Isend, buf, count, datatype, rank, tag);

When the MPI operation is complete, the future will become ready. This allows communication to be integrated cleanly with the rest of HPX; in particular, the continuation style of programming may be used to build up more complex code. Consider the following example, which chains user processing, sends, and receives using continuations:

// create an executor for MPI dispatch
hpx::mpi::experimental::executor exec(MPI_COMM_WORLD);

// post an asynchronous receive using MPI_Irecv
hpx::future<int> f_recv = hpx::async(
    exec, MPI_Irecv, &data, rank, MPI_INT, rank_from, i);

// attach a continuation to run when the recv completes,
f_recv.then([=, &tokens, &counter](auto&&)
{
    // call an application specific function
    msg_recv(rank, size, rank_to, rank_from, tokens[i], i);

    // send a new message
    hpx::future<int> f_send = hpx::async(
        exec, MPI_Isend, &tokens[i], 1, MPI_INT, rank_to, i);

    // when that send completes
    f_send.then([=, &tokens, &counter](auto&&)
    {
        // call an application specific function
        msg_send(rank, size, rank_to, rank_from, tokens[i], i);
    });
});

The example above makes use of MPI_Isend and MPI_Irecv, but any MPI function that uses requests may be futurized in this manner. The following is a (non-exhaustive) list of MPI functions that should be supported, though not all have been tested at the time of writing (please report any problems to the issue tracker).

int MPI_Isend(...);
int MPI_Ibsend(...);
int MPI_Issend(...);
int MPI_Irsend(...);
int MPI_Irecv(...);
int MPI_Imrecv(...);
int MPI_Ibarrier(...);
int MPI_Ibcast(...);
int MPI_Igather(...);
int MPI_Igatherv(...);
int MPI_Iscatter(...);
int MPI_Iscatterv(...);
int MPI_Iallgather(...);
int MPI_Iallgatherv(...);
int MPI_Ialltoall(...);
int MPI_Ialltoallv(...);
int MPI_Ialltoallw(...);
int MPI_Ireduce(...);
int MPI_Iallreduce(...);
int MPI_Ireduce_scatter(...);
int MPI_Ireduce_scatter_block(...);
int MPI_Iscan(...);
int MPI_Iexscan(...);
int MPI_Ineighbor_allgather(...);
int MPI_Ineighbor_allgatherv(...);
int MPI_Ineighbor_alltoall(...);
int MPI_Ineighbor_alltoallv(...);
int MPI_Ineighbor_alltoallw(...);

Note that the HPX MPI futurization wrapper should work with any asynchronous MPI call, as long as the last two arguments of the function signature are MPI_xxx(..., MPI_Comm comm, MPI_Request *request). Internally, these two parameters will be substituted by the executor and future data parameters that are supplied by template instantiations inside the hpx::mpi code.

See the API reference of this module for more details.

async_sycl#

This module allows creating HPX futures using SYCL events, effectively integrating asynchronous SYCL kernels and memory transfers with HPX. Building on this integration, this module also contains a SYCL executor. This executor encapsulates a SYCL queue. When SYCL queue member functions are launched with this executor, the user can automatically obtain the HPX futures associated with them.

The creation of the HPX futures using SYCL events is based on the same event polling mechanism that the CUDA HPX integration uses. Each registered event gets an associated callback and gets inserted into a callback vector to be polled by the scheduler in between tasks. Once the polling reveals the event is complete, the callback will be called, which in turn sets the future to ready (see sycl_event_callback.cpp). There are multiple adaptations for HipSYCL for this: To keep the runtime alive (avoiding the repeated on-the-fly creation of the runtime during the polling), we keep a default queue. Furthermore, we flush the internal SYCL DAG to ensure that the launched SYCL function is actually being executed.

The SYCL executor offers the usual post and async_execute functions. Additionally, it contains two get_future functions. One expects a pre-existing SYCL event to return a future, the other one does not but will launch an empty SYCL kernel instead, to obtain an event (causing higher overhead for the sake of being more convenient). The post and async_execute implementations here are actually different for HipSYCL and OneAPI, since the sycl::queue in OneAPI uses a different interface (using a code_location parameter) which requires some adaptations here.

To make this module compile, we use the -fno-sycl and -fsycl compiler parameters for the OneAPI use-case (requiring HPX to be compiled with dpcpp). For HipSYCL we use its cmake integration instead (requiring HPX to be compiled with clang++ and including HipSYCL as a library).

To build with OneAPI, use the CMake Variable HPX_WITH_SYCL=ON. To build with HipSYCL, use HPX_WITH_SYCL=ON and HPX_WITH_HIPSYCL=ON (and make sure find_package will find HipSYCL).

Lastly, the module contains three tests/examples. All three implement a simple vector add example. The first one obtains a future using the free method get_future, the second one uses a single SYCL executor and the last one is using multiple executors called from multiple host threads.

To build the tests, use “make tests.unit.modules.async_sycl”. To run the tests, use “ctest -R sycl”.

NOTE: Theoretically, this module could work with other SYCL implementations, but was only tested using OneAPI and HipSYCL so far.

See the API reference of this module for more details.

batch_environments#

This module allows for the detection of execution as batch jobs, a series of programs executed without user intervention. All data is preselected and will be executed according to preset parameters, such as date or completion of another task. Batch environments are especially useful for executing repetitive tasks.

HPX supports the creation of batch jobs through the Portable Batch System (PBS) and SLURM.

For more information on batch environments, see Running on batch systems and the API reference for the module.

cache#

This module provides two cache data structures:

See the API reference of the module for more details.

concepts#

This module provides helpers for emulating concepts. It provides the following macros:

  • HPX_CONCEPT_REQUIRES

  • HPX_HAS_MEMBER_XXX_TRAIT_DEF

  • HPX_HAS_XXX_TRAIT_DEF
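
Concept emulation of this kind is commonly built on SFINAE. The following sketch shows the two underlying techniques in standard C++ without any macros: constraining a function template, and detecting whether a type has a particular member. The names half_of_even and has_member_size are hypothetical, and the HPX macros may expand to something different.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Constrained function template: participates in overload resolution
// only for integral types.
template <typename T,
    typename = typename std::enable_if<std::is_integral<T>::value>::type>
T half_of_even(T value)
{
    assert(value % 2 == 0);
    return value / 2;
}

// Detection trait: true if T has a callable member function size().
template <typename T, typename = void>
struct has_member_size : std::false_type
{
};

template <typename T>
struct has_member_size<T,
    decltype(static_cast<void>(std::declval<T const&>().size()))>
  : std::true_type
{
};

static_assert(has_member_size<std::string>::value, "std::string has size()");
static_assert(!has_member_size<int>::value, "int has no size()");
```

Calling half_of_even with a floating point argument fails to compile because no viable overload exists, which is the behavior a concept would provide.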

See the API reference of the module for more details.

concurrency#

This module provides concurrency primitives useful for multi-threaded programming such as:

  • hpx::barrier

  • hpx::util::cache_line_data and hpx::util::cache_aligned_data: wrappers for aligning and padding data to cache lines.

  • various lockfree queue data structures
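
Aligning and padding data to cache lines is typically done to avoid false sharing between threads. The following sketch shows the general idea in standard C++; cache_line_padded, per_worker_counters, and bump are hypothetical names, the cache-line size of 64 bytes is an assumption that does not hold on every platform, and the code is not the implementation of hpx::util::cache_line_data.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumed cache-line size; 64 bytes is common but not universal.
constexpr std::size_t assumed_cache_line_size = 64;

// Aligns T to a cache-line boundary; the compiler pads the struct so
// that its size is a multiple of the alignment.
template <typename T>
struct alignas(assumed_cache_line_size) cache_line_padded
{
    T data;
};

constexpr std::size_t worker_count = 4;

// One counter per worker; neighbouring counters never share a cache line.
struct per_worker_counters
{
    cache_line_padded<std::atomic<long>> slots[worker_count];
};

// Increments the counter that belongs to worker and returns its new value.
long bump(per_worker_counters& counters, std::size_t worker)
{
    assert(worker < worker_count);
    return counters.slots[worker].data.fetch_add(1) + 1;
}
```

Without the padding, several counters would typically fit into one cache line, and updates from different workers would invalidate each other's cached copies even though they touch different variables.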

See the API reference of the module for more details.

config#

The config module contains various configuration options, typically hidden behind macros that choose the correct implementation based on the compiler and other available options. It also contains platform independent macros to control inlining, export sets and more.

See the API reference of the module for more details.

config_registry#

The config_registry module is a low level module providing helper functionality for registering configuration entries to a global registry from other modules. The hpx::config_registry::add_module_config function is used to add configuration options, and hpx::config_registry::get_module_configs can be used to retrieve configuration entries registered so far. add_module_config_helper can be used to register configuration entries through static global options.

See the API reference of this module for more details.

coroutines#

The coroutines module provides coroutine (user-space thread) implementations for different platforms.

See the API reference of the module for more details.

datastructures#

The datastructures module provides basic data structures (typically provided for compatibility with older C++ standards):

See the API reference of the module for more details.

debugging#

This module provides helpers for demangling symbol names.

See the API reference of the module for more details.

errors#

This module provides support for exceptions and error codes:

See the API reference of the module for more details.

execution#

This library implements executors and execution policies for use with parallel algorithms and other facilities related to managing the execution of tasks.

See the API reference of the module for more details.

execution_base#

The basic execution module is the main entry point to implement parallel and concurrent operations. It is modeled after P0443 with some additions and implementations for the described concepts. Most notably, it provides an abstraction for execution resources, execution contexts and execution agents, together with customization points that allow those concepts to be replaced and combined with ease.

For that purpose, three virtual base classes are provided to be able to provide implementations with different properties:

  • resource_base: This is the abstraction for execution resources, that is, for example, CPU cores or an accelerator.

  • context_base: An execution context uses execution resources and is able to spawn new execution agents, as new threads of execution on the available resources.

  • agent_base: The execution agent represents the thread of execution, and can be used to yield, suspend, resume or abort a thread of execution.

executors#

The executors module exposes executors and execution policies. Most importantly, it exposes the following classes and constants:

See the API reference of this module for more details.

filesystem#

This module provides a compatibility layer for the C++17 filesystem library. If the filesystem library is available this module will simply forward its contents into the hpx::filesystem namespace. If the library is not available it will fall back to Boost.Filesystem instead.

See the API reference of the module for more details.

format#

The format module exposes the format and format_to functions for formatting strings.

See the API reference of the module for more details.

functional#

This module provides function wrappers and helpers for managing functions and their arguments.

  • hpx::function

  • hpx::function_ref

  • hpx::move_only_function

  • hpx::bind

  • hpx::bind_back

  • hpx::bind_front

  • hpx::util::deferred_call

  • hpx::invoke

  • hpx::invoke_r

  • hpx::invoke_fused

  • hpx::invoke_fused_r

  • hpx::mem_fn

  • hpx::util::one_shot

  • hpx::util::protect

  • hpx::util::result_of

  • hpx::placeholders::_1

  • hpx::placeholders::_2

  • hpx::placeholders::_9

See the API reference of the module for more details.

futures#

This module defines the hpx::future and hpx::shared_future classes corresponding to the C++ standard library classes std::future and std::shared_future. Note that the specializations of hpx::future::then for executors and execution policies are defined in the execution module.

See the API reference of this module for more details.

hardware#

The hardware module abstracts away hardware specific details of timestamps and CPU features.

See the API reference of the module for more details.

hashing#

The hashing module provides two hashing implementations:

  • hpx::util::fibhash

  • hpx::util::jenkins_hash
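
As background for the first entry, Fibonacci hashing multiplies the key by 2^64 divided by the golden ratio and keeps the topmost bits of the product. The following is a generic sketch of that technique in standard C++; fibonacci_hash and its bits parameter are our own names, and the code is not taken from hpx::util::fibhash.

```cpp
#include <cassert>
#include <cstdint>

// Maps key to a bucket index in the range [0, 2^bits).
std::uint64_t fibonacci_hash(std::uint64_t key, unsigned bits)
{
    assert(bits >= 1 && bits <= 64);

    // 2^64 divided by the golden ratio, rounded to an odd integer.
    constexpr std::uint64_t golden = 11400714819323198485ull;

    // The multiplication wraps modulo 2^64; the top bits are well mixed.
    return (key * golden) >> (64 - bits);
}
```

For a table with 16 buckets (bits equal to 4), the keys 0, 1, and 2 map to the buckets 0, 9, and 3, so consecutive keys are spread across the table rather than clustered.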

See the API reference of the module for more details.

include_local#

This module provides no functionality in itself. Instead it provides headers that group together other headers that often appear together. This module provides local-only headers.

See the API reference of this module for more details.

io_service#

This module provides an abstraction over Boost.Asio, combining multiple asio::io_contexts into a single pool. hpx::util::io_service_pool provides a simple pool of asio::io_contexts with an API similar to asio::io_context. hpx::threads::detail::io_service_thread_pool wraps hpx::util::io_service_pool into an interface derived from hpx::threads::detail::thread_pool_base.

See the API reference of this module for more details.

iterator_support#

This module provides helpers for iterators. It provides hpx::util::iterator_facade and hpx::util::iterator_adaptor for creating new iterators, and the trait hpx::util::is_iterator along with more specific iterator traits.

See the API reference of the module for more details.

itt_notify#

This module provides support for profiling with Intel VTune.

See the API reference of this module for more details.

lci_base#

This module provides helper functionality for detecting LCI environments.

See the API reference of this module for more details.

lcos_local#

This module provides the following local LCOs:

  • hpx::lcos::local::and_gate

  • hpx::lcos::local::channel

  • hpx::lcos::local::one_element_channel

  • hpx::lcos::local::receive_channel

  • hpx::lcos::local::send_channel

  • hpx::lcos::local::guard

  • hpx::lcos::local::guard_set

  • hpx::lcos::local::run_guarded

  • hpx::lcos::local::conditional_trigger

  • hpx::packaged_task

  • hpx::promise

  • hpx::lcos::local::receive_buffer

  • hpx::lcos::local::trigger

See lcos_distributed for distributed LCOs. Basic synchronization primitives for use in HPX threads can be found in synchronization. async_combinators contains useful utility functions for combining futures.

See the API reference of this module for more details.

lock_registration#

This module contains functionality for registering locks to detect when they are locked and unlocked on different threads.

See the API reference of this module for more details.

logging#

This module provides useful macros for logging information.

See the API reference of the module for more details.

memory#

Part of this module is a forked version of boost::intrusive_ptr from Boost.SmartPtr.

See the API reference of the module for more details.

mpi_base#

This module provides helper functionality for detecting MPI environments.

See the API reference of this module for more details.

pack_traversal#

This module exposes the basic functionality for traversing various packs, both synchronously and asynchronously: hpx::util::traverse_pack and hpx::util::traverse_pack_async. It also exposes the higher level functionality of unwrapping nested futures: hpx::util::unwrap and its function object form hpx::util::functional::unwrap.

See the API reference of this module for more details.

plugin#

This module provides base utilities for creating plugins.

See the API reference of the module for more details.

prefix#

This module provides utilities for handling the prefix of an HPX application, i.e. the paths used for searching components and plugins.

See the API reference of this module for more details.

preprocessor#

This library contains useful preprocessor macros.

See the API reference of the module for more details.

program_options#

The module program_options is a direct fork of the Boost.Program_options library (Boost V1.70.0). In order to be included as an HPX module, the Boost.Program_options library has been moved to the namespace hpx::program_options. We have also replaced all Boost facilities the library depends on with either the equivalent facilities from the standard library or from HPX. As a result, the HPX program_options module is fully interface compatible with Boost.Program_options (sans the hpx namespace and the #include <hpx/modules/program_options.hpp> changes that need to be applied to all code relying on this library).

All credit goes to Vladimir Prus, the author of the excellent Boost.Program_options library. All bugs have been introduced by us.

See the API reference of the module for more details.

properties#

This module implements the prefer customization point for properties in terms of P2220. This differs from P1393 in that it relies fully on tag_invoke overloads and fewer base customization points. Actual properties are defined in modules. All functionality is experimental and can be accessed through the hpx::experimental namespace.

See the API reference of this module for more details.

resiliency#

In HPX, a program failure is a manifestation of a failing task. This module exposes several APIs that allow users to manage failing tasks in a convenient way by either replaying a failed task or by replicating a specific task.

Task replay is analogous to the Checkpoint/Restart mechanism found in conventional execution models. The key difference is localized fault detection: when the runtime detects an error, it replays only the failing task instead of rolling the entire program back to the previous checkpoint.
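
The control flow of task replay can be sketched in a few lines of standard C++. This is only an illustration of the idea, not the HPX implementation: it runs synchronously, and the names replay_sketch and replay_failed are hypothetical stand-ins for the asynchronous APIs and the abort_replay_exception listed below.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for the exception thrown once every attempt failed.
struct replay_failed : std::runtime_error
{
    replay_failed() : std::runtime_error("all replay attempts failed") {}
};

// Runs task() up to n times and returns the first result that is produced
// without an exception. Only the failing task is re-run, nothing else.
template <typename F>
auto replay_sketch(std::size_t n, F&& task) -> decltype(task())
{
    assert(n > 0);
    for (std::size_t attempt = 0; attempt != n; ++attempt)
    {
        try
        {
            return task();
        }
        catch (...)
        {
            // Localized fault detection: drop the error and try again.
        }
    }
    throw replay_failed();
}
```

A task that fails twice and then succeeds is therefore invoked three times in total, while the rest of the program is unaffected.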

Task replication is designed to provide reliability enhancements by replicating a set of tasks and evaluating their results to determine a consensus among them. This technique is most effective in situations where there are few tasks in the critical path of the DAG which leaves the system underutilized or where hardware or software failures may result in an incorrect result instead of an error. However, the drawback of this method is the additional computational cost incurred by repeating a task multiple times.

The following API functions are exposed:

  • hpx::resiliency::experimental::async_replay: This version of task replay will catch user-defined exceptions and automatically reschedule the task N times before throwing an hpx::resiliency::experimental::abort_replay_exception if no task is able to complete execution without an exception.

  • hpx::resiliency::experimental::async_replay_validate: This version of replay adds an argument to async replay which receives a user-provided validation function to test the result of the task against. If the task’s output is validated, the result is returned. If the output fails the check or an exception is thrown, the task is replayed until no errors are encountered or the number of specified retries has been exceeded.

  • hpx::resiliency::experimental::async_replicate: This is the most basic implementation of task replication. The API returns the first result that was computed without detecting any errors.

  • hpx::resiliency::experimental::async_replicate_validate: This API additionally takes a validation function which evaluates the return values produced by the threads. The first task to compute a valid result is returned.

  • hpx::resiliency::experimental::async_replicate_vote: This API adds a vote function to the basic replicate function. Many hardware or software failures are silent errors which do not interrupt program flow. In order to detect errors of this kind, it is necessary to run the task several times and compare the values returned by every version of the task. In order to determine which return value is “correct”, the API allows the user to provide a custom consensus function to properly form a consensus. This voting function then returns the “correct” answer.

  • hpx::resiliency::experimental::async_replicate_vote_validate: This combines the features of the previously discussed replicate set. Replicate vote validate allows a user to provide a validation function to filter results. Additionally, as described in replicate vote, the user can provide a “voting function” which returns the consensus formed by the voting logic.

  • hpx::resiliency::experimental::dataflow_replay: This version of dataflow replay will catch user-defined exceptions and automatically reschedule the task N times before throwing an hpx::resiliency::experimental::abort_replay_exception if no task is able to complete execution without an exception. Any arguments for the executed task that are futures will cause the task invocation to be delayed until all of those futures have become ready.

  • hpx::resiliency::experimental::dataflow_replay_validate: This version of replay adds an argument to dataflow replay which receives a user-provided validation function to test the result of the task against. If the task’s output is validated, the result is returned. If the output fails the check or an exception is thrown, the task is replayed until no errors are encountered or the number of specified retries has been exceeded. Any arguments for the executed task that are futures will cause the task invocation to be delayed until all of those futures have become ready.

  • hpx::resiliency::experimental::dataflow_replicate: This is the most basic implementation of task replication. The API returns the first result that was computed without detecting any errors. Any arguments for the executed task that are futures will cause the task invocation to be delayed until all of those futures have become ready.

  • hpx::resiliency::experimental::dataflow_replicate_validate: This API additionally takes a validation function which evaluates the return values produced by the threads. The first task to compute a valid result is returned. Any arguments for the executed task that are futures will cause the task invocation to be delayed until all of those futures have become ready.

  • hpx::resiliency::experimental::dataflow_replicate_vote: This API adds a vote function to the basic replicate function. Many hardware or software failures are silent errors which do not interrupt program flow. In order to detect errors of this kind, it is necessary to run the task several times and compare the values returned by every version of the task. In order to determine which return value is “correct”, the API allows the user to provide a custom consensus function to properly form a consensus. This voting function then returns the “correct” answer. Any arguments for the executed task that are futures will cause the task invocation to be delayed until all of those futures have become ready.

  • hpx::resiliency::experimental::dataflow_replicate_vote_validate: This combines the features of the previously discussed replicate set. Replicate vote validate allows a user to provide a validation function to filter results. Additionally, as described in replicate vote, the user can provide a “voting function” which returns the consensus formed by the voting logic. Any arguments for the executed task that are futures will cause the task invocation to be delayed until all of those futures have become ready.
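
The combination of validation and voting described in the list above can be sketched as follows. This is a simplified, sequential illustration in standard C++ under our own assumptions: the real APIs run the replicas asynchronously and return futures, and the names replicate_vote_validate_sketch and majority_vote are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <utility>
#include <vector>

// Runs task() n times, keeps the results accepted by validate(), and lets
// vote() pick the consensus among them. Replicas that throw are ignored.
template <typename F, typename Validate, typename Vote>
auto replicate_vote_validate_sketch(
    std::size_t n, F&& task, Validate&& validate, Vote&& vote)
    -> decltype(task())
{
    using result_type = decltype(task());
    std::vector<result_type> valid;
    for (std::size_t i = 0; i != n; ++i)
    {
        try
        {
            result_type r = task();
            if (validate(r))
                valid.push_back(std::move(r));
        }
        catch (...)
        {
            // A failed replica simply contributes no vote.
        }
    }
    if (valid.empty())
        throw std::runtime_error("no replica produced a valid result");
    return vote(std::move(valid));
}

// One possible voting function: returns the most frequent value.
inline int majority_vote(std::vector<int> const& values)
{
    assert(!values.empty());
    std::map<int, std::size_t> counts;
    for (int v : values)
        ++counts[v];
    int best = values.front();
    for (auto const& kv : counts)
        if (kv.second > counts[best])
            best = kv.first;
    return best;
}
```

The validation function filters out results that are obviously wrong, while the voting function handles silent errors that still pass validation but disagree with the majority.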

See the API reference of the module for more details.

resource_partitioner#

The resource_partitioner module defines hpx::resource::partitioner, the class used by the runtime and users to partition available hardware resources into thread pools. See Using the resource partitioner for more details on using the resource partitioner in applications.

See the API reference of this module for more details.

runtime_configuration#

This module handles the configuration options required by the runtime.

See the API reference of this module for more details.

schedulers#

This module provides schedulers used by thread pools in the thread_pools module. There are currently three main schedulers:

  • hpx::threads::policies::local_priority_queue_scheduler

  • hpx::threads::policies::static_priority_queue_scheduler

  • hpx::threads::policies::shared_priority_queue_scheduler

Other schedulers are specializations or variations of the above schedulers. See the examples of the resource_partitioner module for examples of specifying a custom scheduler for a thread pool.

See the API reference of this module for more details.

serialization#

This module provides serialization primitives and support for all built-in types as well as all C++ Standard Library collection and utility types. This list is extended by HPX vocabulary types with proper support for global reference counting. HPX’s mode of serialization is derived from Boost’s serialization model and, as such, is mostly interface compatible with its Boost counterpart.

The purest form of serializing data is to copy the content of the payload bit by bit; however, this method is impractical for generic C++ types, which might be composed of more than just regular built-in types. Instead, HPX’s approach to serialization is derived from the Boost Serialization library, and is geared towards allowing the programmer of a given class explicit control and syntax of what to serialize. It is based on operator overloading of two special archive types that hold a buffer or stream to store the serialized data and is responsible for dispatching the serialization mechanism to the intrusive or non-intrusive version. The serialization process is recursive. Each member that needs to be serialized must be specified explicitly. The advantage of this approach is that the serialization code is written in C++ and leverages all necessary programming techniques. The generic, user-facing interface allows for effective application of the serialization process without obstructing the algorithms that need special code for packing and unpacking. It also allows for optimizations in the implementation of the archives.
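
The following sketch illustrates the general idea in standard C++: two archive types overload the same operator, built-in types are copied bit by bit, and class types list their members explicitly in a serialize function, which makes the process recursive. It is a simplified illustration under our own assumptions and not the HPX archive interface. All names (output_archive_sketch, input_archive_sketch, point_sketch, segment_sketch) are hypothetical, and only the intrusive variant is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical output archive: appends the bytes of each value to a buffer.
struct output_archive_sketch
{
    std::vector<char> buffer;

    template <typename T>
    output_archive_sketch& operator&(T& value)
    {
        save(value, std::is_arithmetic<T>{});
        return *this;
    }

private:
    // Built-in arithmetic types are copied bit by bit.
    template <typename T>
    void save(T const& value, std::true_type)
    {
        char const* p = reinterpret_cast<char const*>(&value);
        buffer.insert(buffer.end(), p, p + sizeof(T));
    }

    // Class types recurse into their own (intrusive) serialize member.
    template <typename T>
    void save(T& value, std::false_type)
    {
        value.serialize(*this);
    }
};

// Hypothetical input archive: reads the values back in the same order.
struct input_archive_sketch
{
    std::vector<char> buffer;
    std::size_t pos = 0;

    template <typename T>
    input_archive_sketch& operator&(T& value)
    {
        load(value, std::is_arithmetic<T>{});
        return *this;
    }

private:
    template <typename T>
    void load(T& value, std::true_type)
    {
        assert(pos + sizeof(T) <= buffer.size());
        std::memcpy(&value, buffer.data() + pos, sizeof(T));
        pos += sizeof(T);
    }

    template <typename T>
    void load(T& value, std::false_type)
    {
        value.serialize(*this);
    }
};

// Each member that needs to be serialized is named explicitly.
struct point_sketch
{
    int x;
    double y;

    template <typename Archive>
    void serialize(Archive& ar)
    {
        ar & x & y;
    }
};

// Serializing a segment recurses into its two points.
struct segment_sketch
{
    point_sketch a;
    point_sketch b;

    template <typename Archive>
    void serialize(Archive& ar)
    {
        ar & a & b;
    }
};
```

Because the same serialize function is used for saving and loading, the order of the members cannot get out of sync between the two directions.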

See the API reference of the module for more details.

static_reinit#

This module provides a simple wrapper around static variables that can be reinitialized.

See the API reference of this module for more details.

string_util#

This module contains string utilities inspired by the Boost String Algorithms Library.

See the API reference of this module for more details.

synchronization#

This module provides synchronization primitives that should be used rather than the C++ standard ones in HPX threads.

See lcos_local, async_combinators, and async_distributed for higher level synchronization facilities.

See the API reference of this module for more details.

testing#

The testing module contains useful macros for testing. The results of tests can be printed with hpx::util::report_errors. The following macros are provided:

  • HPX_TEST

  • HPX_TEST_MSG

  • HPX_TEST_EQ

  • HPX_TEST_NEQ

  • HPX_TEST_LT

  • HPX_TEST_LTE

  • HPX_TEST_RANGE

  • HPX_TEST_EQ_MSG

  • HPX_TEST_NEQ_MSG

  • HPX_SANITY

  • HPX_SANITY_MSG

  • HPX_SANITY_EQ

  • HPX_SANITY_NEQ

  • HPX_SANITY_LT

  • HPX_SANITY_LTE

  • HPX_SANITY_RANGE

  • HPX_SANITY_EQ_MSG

See the API reference of the module for more details.

thread_pool_util#

This module contains helper functions for asynchronously suspending and resuming thread pools and their worker threads.

See the API reference of this module for more details.

thread_pools#

This module defines the thread pools and utilities used by the HPX runtime. The only thread pool implementation provided by this module is hpx::threads::detail::scheduled_thread_pool, which is derived from hpx::threads::detail::thread_pool_base defined in the threading_base module.

See the API reference of this module for more details.

thread_support#

This module provides miscellaneous utilities for threading and concurrency.

See the API reference of the module for more details.

threading#

This module provides the equivalents of std::thread and std::jthread for lightweight HPX threads.

See the API reference of this module for more details.

threading_base#

This module contains the base class definition required for threads. The base class hpx::threads::thread_data is inherited by two specializations for stackful and stackless threads: hpx::threads::thread_data_stackful and hpx::threads::thread_data_stackless. In addition, the module defines the base classes for schedulers and thread pools: hpx::threads::policies::scheduler_base and hpx::threads::thread_pool_base.

See the API reference of this module for more details.

thread_manager#

This module defines the hpx::threads::threadmanager class. This is used by the runtime to manage the creation and destruction of thread pools. The resource_partitioner module handles the partitioning of resources into thread pools, but not the creation of thread pools.

See the API reference of this module for more details.

timed_execution#

This module provides extensions to the executor interfaces defined in the execution module that allow timed submission of tasks on thread pools (at or after a specified time).

See the API reference of this module for more details.

timing#

This module provides the timing utilities (clocks and timers).

See the API reference of the module for more details.

topology#

This module provides the class hpx::threads::topology which represents the hardware resources available on a node. The class is a light wrapper around the Portable Hardware Locality (HWLOC) library. The hpx::threads::cpu_mask is a small companion class that represents a set of resources on a node.

See the API reference of the module for more details.

type_support#

This module provides helper facilities related to types.

See the API reference of the module for more details.

util#

The util module provides miscellaneous standalone utilities.

See the API reference of the module for more details.

version#

This module provides macros and functions for accessing version information about HPX and its dependencies.

See the API reference of this module for more details.

Main HPX modules#

actions#

TODO: High-level description of the library.

See the API reference of this module for more details.

actions_base#

TODO: High-level description of the library.

See the API reference of this module for more details.

agas#

TODO: High-level description of the module.

See the API reference of this module for more details.

agas_base#

This module holds the implementation of the four AGAS services: primary namespace, locality namespace, component namespace, and symbol namespace.

See the API reference of this module for more details.

async_colocated#

TODO: High-level description of the module.

See the API reference of this module for more details.

async_distributed#

This module contains functionality for asynchronously launching work on remote localities: hpx::async and hpx::post. This module extends the local-only functions in async_local.

See the API reference of this module for more details.

checkpoint#

A common need of users is to periodically back up an application. This practice provides resiliency and potential restart points in code. HPX utilizes the concept of a checkpoint to support this use case.

Found in hpx/util/checkpoint.hpp, checkpoints are defined as objects that hold a serialized version of an object or set of objects at a particular moment in time. This representation can be stored in memory for later use or it can be written to disk for storage and/or recovery at a later point. In order to create and fill this object with data, users must use a function called save_checkpoint. In code the function looks like this:

hpx::future<hpx::util::checkpoint> hpx::util::save_checkpoint(a, b, c, ...);

save_checkpoint takes arbitrary data containers, such as int, double, float, vector, and future, and serializes them into a newly created checkpoint object. This function returns a future to a checkpoint containing the data. Here’s an example of a simple use case:

using hpx::util::checkpoint;
using hpx::util::save_checkpoint;

std::vector<int> vec{1,2,3,4,5};
hpx::future<checkpoint> f = save_checkpoint(vec);

Once the future is ready, the checkpoint object will contain the vector vec and its five elements.

prepare_checkpoint takes arbitrary data containers (the same as for save_checkpoint), such as int, double, float, vector, and future, and calculates the necessary buffer space for the checkpoint that would be created if save_checkpoint were called with the same arguments. This function returns a future to a checkpoint that is appropriately initialized. Here’s an example of a simple use case:

using hpx::util::checkpoint;
using hpx::util::prepare_checkpoint;

std::vector<int> vec{1,2,3,4,5};
hpx::future<checkpoint> f = prepare_checkpoint(vec);

Once the future is ready, the checkpoint object will be initialized with an appropriately sized internal buffer.

It is also possible to modify the launch policy used by save_checkpoint. This is accomplished by passing a launch policy as the first argument. It is important to note that passing hpx::launch::sync will cause save_checkpoint to return a checkpoint instead of a future to a checkpoint. All other policies passed to save_checkpoint will return a future to a checkpoint.

Sometimes checkpoints must be declared before they are used. save_checkpoint allows users to move pre-created checkpoints into the function as long as they are the first container passed into the function (if a launch policy is used, the checkpoint immediately follows the launch policy). An example of these features can be found below:

    char character = 'd';
    int integer = 10;
    float flt = 10.01f;
    bool boolean = true;
    std::string str = "I am a string of characters";
    std::vector<char> vec(str.begin(), str.end());
    checkpoint archive;

    // Test 1
    //  test basic functionality
    hpx::shared_future<checkpoint> f_archive = save_checkpoint(
        std::move(archive), character, integer, flt, boolean, str, vec);

Once users have created checkpoints, they need to be able to restore the objects they contain into memory. This is accomplished by the function restore_checkpoint. This function takes a checkpoint and fills its data into the containers it is provided. It is important to remember that the containers must be ordered in the same way they were placed into the checkpoint. For clarity see the example below:

    char character2;
    int integer2;
    float flt2;
    bool boolean2;
    std::string str2;
    std::vector<char> vec2;

    restore_checkpoint(data, character2, integer2, flt2, boolean2, str2, vec2);
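
To see why the order matters, consider the following sketch in standard C++. It stores a list of values back to back in a flat byte buffer and reads them back positionally, so each value is identified only by where it sits in the buffer. This is a deliberately simplified model under our own assumptions, restricted to trivially copyable types. The names byte_buffer, save_sketch, and restore_sketch are hypothetical and do not reflect how checkpoint is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a checkpoint: a flat byte buffer.
using byte_buffer = std::vector<char>;

// End of recursion: nothing left to save.
inline void save_sketch(byte_buffer&) {}

// Appends the bytes of each argument in the order the arguments are given.
template <typename T, typename... Ts>
void save_sketch(byte_buffer& buf, T const& first, Ts const&... rest)
{
    static_assert(std::is_trivially_copyable<T>::value,
        "this sketch handles trivially copyable types only");
    char const* p = reinterpret_cast<char const*>(&first);
    buf.insert(buf.end(), p, p + sizeof(T));
    save_sketch(buf, rest...);
}

// End of recursion: nothing left to restore.
inline void restore_sketch(byte_buffer const&, std::size_t) {}

// Reads the values back positionally, starting at byte offset pos. Nothing
// in the buffer names the values, so the order must match the save call.
template <typename T, typename... Ts>
void restore_sketch(
    byte_buffer const& buf, std::size_t pos, T& first, Ts&... rest)
{
    assert(pos + sizeof(T) <= buf.size());
    std::memcpy(&first, buf.data() + pos, sizeof(T));
    restore_sketch(buf, pos + sizeof(T), rest...);
}
```

Calling restore_sketch with the arguments in a different order than save_sketch would copy the bytes of one value into a variable of another type, which is exactly the mistake the ordering rule guards against.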

The core utility of checkpoint is in its ability to make certain data persistent. Often, this means that the data needs to be stored in an object, such as a file, for later use. HPX has two solutions for these issues: stream operator overloads and access iterators.

HPX contains two stream overloads, operator<< and operator>>, to stream data out of and into a checkpoint. Here is an example of the overloads in use:

    double a9 = 1.0, b9 = 1.1, c9 = 1.2;
    std::ofstream test_file_9("test_file_9.txt");
    hpx::future<checkpoint> f_9 = save_checkpoint(a9, b9, c9);
    test_file_9 << f_9.get();
    test_file_9.close();

    double a9_1, b9_1, c9_1;
    std::ifstream test_file_9_1("test_file_9.txt");
    checkpoint archive9;
    test_file_9_1 >> archive9;
    restore_checkpoint(archive9, a9_1, b9_1, c9_1);

This is the primary way to move data into and out of a checkpoint. It is important to note, however, that users should be cautious when using a stream operator to load data and another function to remove it (or vice versa). Both operator<< and operator>> rely on a .write() and a .read() function respectively. In order to know how much data to read from the std::istream, the operator<< will write the size of the checkpoint before writing the checkpoint data. Correspondingly, the operator>> will read the size of the stored data before reading the data into a new instance of checkpoint. As long as the user employs the operator<< and operator>> to stream the data, this detail can be ignored.

Important

Be careful when mixing operator<< and operator>> with other facilities to read and write to a checkpoint. operator<< writes an extra variable, and operator>> reads this variable back separately. Used together the user will not encounter any issues and can safely ignore this detail.
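
The size-prefix convention described above can be sketched in standard C++ as follows. This is an illustration of the general technique only. The function names are hypothetical, and the 64-bit width of the size field is an assumption made for the sketch, not a statement about the on-disk format used by checkpoint.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <vector>

// Writes the payload size first, then the payload bytes.
inline void write_sized_sketch(std::ostream& os, std::vector<char> const& data)
{
    assert(os.good());
    std::uint64_t const size = data.size();
    os.write(reinterpret_cast<char const*>(&size), sizeof(size));
    os.write(data.data(), static_cast<std::streamsize>(data.size()));
}

// Reads the size back first, so it knows how many bytes belong to the payload.
inline std::vector<char> read_sized_sketch(std::istream& is)
{
    std::uint64_t size = 0;
    if (!is.read(reinterpret_cast<char*>(&size), sizeof(size)))
        throw std::runtime_error("missing size prefix");
    std::vector<char> data(static_cast<std::size_t>(size));
    if (!is.read(data.data(), static_cast<std::streamsize>(size)))
        throw std::runtime_error("truncated payload");
    return data;
}
```

A reader that does not expect the prefix would treat the size bytes as part of the payload, which is why the two operators should be used as a pair.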

Users may also move the data into and out of a checkpoint using the exposed .begin() and .end() iterators. An example of this use case is illustrated below.

    std::ofstream test_file_7("checkpoint_test_file.txt");
    std::vector<float> vec7{1.02f, 1.03f, 1.04f, 1.05f};
    hpx::future<checkpoint> fut_7 = save_checkpoint(vec7);
    checkpoint archive7 = fut_7.get();
    std::copy(archive7.begin(),    // Write data to ofstream
        archive7.end(),            // ie. the file
        std::ostream_iterator<char>(test_file_7));
    test_file_7.close();

    std::vector<float> vec7_1;
    std::vector<char> char_vec;
    std::ifstream test_file_7_1("checkpoint_test_file.txt");
    if (test_file_7_1)
    {
        test_file_7_1.seekg(0, test_file_7_1.end);
        auto length = test_file_7_1.tellg();
        test_file_7_1.seekg(0, test_file_7_1.beg);
        char_vec.resize(length);
        test_file_7_1.read(char_vec.data(), length);
    }
    checkpoint archive7_1(std::move(char_vec));    // Write data to checkpoint
    restore_checkpoint(archive7_1, vec7_1);

Checkpointing components#

save_checkpoint and restore_checkpoint are also able to store components inside checkpoints. This can be done in one of two ways. First, a client of the component can be passed to save_checkpoint. When the user wishes to resurrect the component, she can pass a client instance to restore_checkpoint.

This technique is demonstrated below:

    // Try to checkpoint and restore a component with a client
    std::vector<int> vec3{10, 10, 10, 10, 10};

    // Create a component instance through client constructor
    data_client D(hpx::find_here(), std::move(vec3));
    hpx::future<checkpoint> f3 = save_checkpoint(D);

    // Create a new client
    data_client E;

    // Restore server inside client instance
    restore_checkpoint(f3.get(), E);

The second way a user can save a component is by passing a shared_ptr to the component to save_checkpoint. This component can be resurrected by creating a new instance of the component type and passing a shared_ptr to the new instance to restore_checkpoint.

This technique is demonstrated below:

    // test checkpoint a component using a shared_ptr
    std::vector<int> vec{1, 2, 3, 4, 5};
    data_client A(hpx::find_here(), std::move(vec));

    // Checkpoint Server
    hpx::id_type old_id = A.get_id();

    hpx::future<std::shared_ptr<data_server>> f_a_ptr =
        hpx::get_ptr<data_server>(A.get_id());
    std::shared_ptr<data_server> a_ptr = f_a_ptr.get();
    hpx::future<checkpoint> f = save_checkpoint(a_ptr);
    auto&& data = f.get();

    // test prepare_checkpoint API
    checkpoint c = prepare_checkpoint(hpx::launch::sync, a_ptr);
    HPX_TEST(c.size() == data.size());

    // Restore Server
    // Create a new server instance
    std::shared_ptr<data_server> b_server;
    restore_checkpoint(data, b_server);

checkpoint_base#

The checkpoint_base module contains lower level facilities that wrap simple check-pointing capabilities. This module does not implement special handling for futures or components, but simply serializes all arguments to or from a given container.

This module exposes the hpx::util::save_checkpoint_data, hpx::util::restore_checkpoint_data, and hpx::util::prepare_checkpoint_data APIs. These functions encapsulate the basic serialization functionalities necessary to save/restore a variadic list of arguments to/from a given data container.

See the API reference of this module for more details.

collectives#

The collectives module exposes a set of distributed collective operations. Those can be used to exchange data between participating sites in a coordinated way. At this point the module exposes several collective primitives.

See the API reference of the module for more details.

command_line_handling#

The command_line_handling module defines and handles the command-line options required by the HPX runtime, combining them with configuration options defined by the runtime_configuration module. The actual parsing of command line options is handled by the program_options module.

See the API reference of the module for more details.

components#

TODO: High-level description of the module.

See the API reference of this module for more details.

components_base#

TODO: High-level description of the library.

See the API reference of this module for more details.

compute#

The compute module provides utilities for handling task and memory affinity on host systems.

See the API reference of the module for more details.

distribution_policies#

TODO: High-level description of the module.

See the API reference of this module for more details.

executors_distributed#

This module provides the executor hpx::parallel::execution::distribution_policy_executor. It allows one to create work that is implicitly distributed over multiple localities.

See the API reference of this module for more details.

include#

This module provides no functionality in itself. Instead it provides headers that group together other headers that often appear together.

See the API reference of this module for more details.

init_runtime#

TODO: High-level description of the library.

See the API reference of this module for more details.

lcos_distributed#

This module contains distributed LCOs. Currently the only LCO provided is hpx::lcos::channel, a construct for sending values from one locality to another. See lcos_local for local LCOs.

See the API reference of this module for more details.

naming#

TODO: High-level description of the module.

See the API reference of this module for more details.

naming_base#

This module provides a forward declaration of address_type, component_type and invalid_locality_id.

See the API reference of this module for more details.

parcelport_lci#

TODO: High-level description of the module.

See the API reference of this module for more details.

parcelport_mpi#

TODO: High-level description of the module.

See the API reference of this module for more details.

parcelport_tcp#

TODO: High-level description of the module.

See the API reference of this module for more details.

parcelset#

TODO: High-level description of the module.

See the API reference of this module for more details.

parcelset_base#

TODO: High-level description of the module.

See the API reference of this module for more details.

performance_counters#

This module provides the basic functionality required for defining performance counters. See Performance counters for more information about performance counters.

See the API reference of this module for more details.

plugin_factories#

TODO: High-level description of the module.

See the API reference of this module for more details.

resiliency_distributed#

Software resiliency features of HPX were introduced in the resiliency module. This module extends the APIs to run on distributed-memory systems allowing the user to invoke the failing task on other localities at runtime. This is useful in cases where a node is identified to fail more often (e.g., for certain ALU computes) as the task can now be replayed or replicated among different localities. The API exposed allows for an easy integration with the local only resiliency APIs as well.

Distributed software resilience APIs have similar function signatures and live under the same namespace, hpx::resiliency::experimental. The difference arises in the formal parameters: the distributed APIs take the localities as the first argument, and an action as opposed to a function or a function object. The localities signify the order in which the API will either schedule tasks in a round robin fashion (in the case of task replay) or replicate the tasks onto the list of localities.

The list of APIs exposed by the distributed resiliency module is the same as the list defined in the local resiliency module.
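
The round robin scheduling used for distributed task replay can be pictured with the following sequential sketch in standard C++. It is an illustration under our own assumptions rather than the HPX implementation: localities are modelled as plain integers, the action is modelled as a callable that receives the target locality, and the name distributed_replay_sketch is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Tries task(locality) up to n times. After every failure the next attempt
// goes to the next locality in the list, wrapping around at the end.
template <typename F>
auto distributed_replay_sketch(
    std::vector<int> const& localities, std::size_t n, F&& task)
    -> decltype(task(0))
{
    assert(!localities.empty() && n > 0);
    for (std::size_t attempt = 0; attempt != n; ++attempt)
    {
        int const target = localities[attempt % localities.size()];
        try
        {
            return task(target);
        }
        catch (...)
        {
            // This locality failed; move on to the next one.
        }
    }
    throw std::runtime_error("all replay attempts failed");
}
```

With three localities and five attempts, the task is tried on the first, second, and third locality, and then again on the first and second.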

See the API reference of this module for more details.

runtime_components#

TODO: High-level description of the module.

See the API reference of this module for more details.

runtime_distributed#

TODO: High-level description of the module.

See the API reference of this module for more details.

segmented_algorithms#

Segmented algorithms extend the usual parallel algorithms by providing overloads that work with distributed containers, such as partitioned vectors.

See the API reference of the module for more details.

statistics#

This module provides some statistics utilities like rolling min/max and histogram.

See the API reference of the module for more details.

API reference#

HPX follows a versioning scheme with three numbers: major.minor.patch. We guarantee no breaking changes in the API for patch releases. Minor releases may remove or break existing APIs, but only after a deprecation period of at least two minor releases. Only in rare cases do we remove old and unused functionality outright without a deprecation period.

We do not provide any ABI compatibility guarantees between any versions, debug and release builds, and builds with different C++ standards.

The public API of HPX is presented below. Clicking on a name brings you to the full documentation for the class or function. Including the header specified in a heading brings in the features listed under that heading.

Note

Names listed here are guaranteed stable with respect to semantic versioning. However, at the moment the list is incomplete and certain unlisted features are intended to be in the public API. While we work on completing the list, if you’re unsure whether a particular unlisted name is part of the public API, you can contact us or open an issue and we’ll clarify the situation.

Public API#

Our API is semantically conforming; hence, the reader is highly encouraged to refer to the corresponding facility in the C++ Standard if needed. All names below are also available in the top-level hpx namespace unless otherwise noted. The names in hpx should be preferred. The names in sub-namespaces will eventually be removed.

hpx/algorithm.hpp#

The header hpx/algorithm.hpp corresponds to the C++ standard library header algorithm. See Using parallel algorithms for more information about the parallel algorithms.

Classes#
Table 122 Classes of header hpx/algorithm.hpp#

Class

C++ standard

hpx::experimental::reduction

N4808

hpx::experimental::induction

N4808

Functions#
Table 123 hpx functions of header hpx/algorithm.hpp#

hpx function

C++ standard

hpx::adjacent_find

std::adjacent_find

hpx::all_of

std::all_of

hpx::any_of

std::any_of

hpx::copy

std::copy

hpx::copy_if

std::copy_if

hpx::copy_n

std::copy_n

hpx::count

std::count

hpx::count_if

std::count_if

hpx::ends_with

std::ends_with

hpx::equal

std::equal

hpx::fill

std::fill

hpx::fill_n

std::fill_n

hpx::find

std::find

hpx::find_end

std::find_end

hpx::find_first_of

std::find_first_of

hpx::find_if

std::find_if

hpx::find_if_not

std::find_if_not

hpx::for_each

std::for_each

hpx::for_each_n

std::for_each_n

hpx::generate

std::generate

hpx::generate_n

std::generate_n

hpx::includes

std::includes

hpx::inplace_merge

std::inplace_merge

hpx::is_heap

std::is_heap

hpx::is_heap_until

std::is_heap_until

hpx::is_partitioned

std::is_partitioned

hpx::is_sorted

std::is_sorted

hpx::is_sorted_until

std::is_sorted_until

hpx::lexicographical_compare

std::lexicographical_compare

hpx::make_heap

std::make_heap

hpx::max_element

std::max_element

hpx::merge

std::merge

hpx::min_element

std::min_element

hpx::minmax_element

std::minmax_element

hpx::mismatch

std::mismatch

hpx::move

std::move

hpx::none_of

std::none_of

hpx::nth_element

std::nth_element

hpx::partial_sort

std::partial_sort

hpx::partial_sort_copy

std::partial_sort_copy

hpx::partition

std::partition

hpx::partition_copy

std::partition_copy

hpx::experimental::reduce_by_key

reduce_by_key

hpx::remove

std::remove

hpx::remove_copy

std::remove_copy

hpx::remove_copy_if

std::remove_copy_if

hpx::remove_if

std::remove_if

hpx::replace

std::replace

hpx::replace_copy

std::replace_copy

hpx::replace_copy_if

std::replace_copy_if

hpx::replace_if

std::replace_if

hpx::reverse

std::reverse

hpx::reverse_copy

std::reverse_copy

hpx::rotate

std::rotate

hpx::rotate_copy

std::rotate_copy

hpx::search

std::search

hpx::search_n

std::search_n

hpx::set_difference

std::set_difference

hpx::set_intersection

std::set_intersection

hpx::set_symmetric_difference

std::set_symmetric_difference

hpx::set_union

std::set_union

hpx::shift_left

std::shift_left

hpx::shift_right

std::shift_right

hpx::sort

std::sort

hpx::experimental::sort_by_key

sort_by_key

hpx::stable_partition

std::stable_partition

hpx::stable_sort

std::stable_sort

hpx::starts_with

std::starts_with

hpx::swap_ranges

std::swap_ranges

hpx::transform

std::transform

hpx::unique

std::unique

hpx::unique_copy

std::unique_copy

hpx::experimental::for_loop

N4808

hpx::experimental::for_loop_strided

N4808

hpx::experimental::for_loop_n

N4808

hpx::experimental::for_loop_n_strided

N4808

Table 124 hpx::ranges functions of header hpx/algorithm.hpp#

hpx::ranges function

C++ standard

hpx::ranges::adjacent_find

std::adjacent_find

hpx::ranges::all_of

std::all_of

hpx::ranges::any_of

std::any_of

hpx::ranges::copy

std::copy

hpx::ranges::copy_if

std::copy_if

hpx::ranges::copy_n

std::copy_n

hpx::ranges::count

std::count

hpx::ranges::count_if

std::count_if

hpx::ranges::ends_with

std::ends_with

hpx::ranges::equal

std::equal

hpx::ranges::fill

std::fill

hpx::ranges::fill_n

std::fill_n

hpx::ranges::find

std::find

hpx::ranges::find_end

std::find_end

hpx::ranges::find_first_of

std::find_first_of

hpx::ranges::find_if

std::find_if

hpx::ranges::find_if_not

std::find_if_not

hpx::ranges::for_each

std::for_each

hpx::ranges::for_each_n

std::for_each_n

hpx::ranges::generate

std::generate

hpx::ranges::generate_n

std::generate_n

hpx::ranges::includes

std::includes

hpx::ranges::inplace_merge

std::inplace_merge

hpx::ranges::is_heap

std::is_heap

hpx::ranges::is_heap_until

std::is_heap_until

hpx::ranges::is_partitioned

std::is_partitioned

hpx::ranges::is_sorted

std::is_sorted

hpx::ranges::is_sorted_until

std::is_sorted_until

hpx::ranges::make_heap

std::make_heap

hpx::ranges::max_element

std::max_element

hpx::ranges::merge

std::merge

hpx::ranges::min_element

std::min_element

hpx::ranges::minmax_element

std::minmax_element

hpx::ranges::mismatch

std::mismatch

hpx::ranges::move

std::move

hpx::ranges::none_of

std::none_of

hpx::ranges::nth_element

std::nth_element

hpx::ranges::partial_sort

std::partial_sort

hpx::ranges::partial_sort_copy

std::partial_sort_copy

hpx::ranges::partition

std::partition

hpx::ranges::partition_copy

std::partition_copy

hpx::ranges::set_difference

std::set_difference

hpx::ranges::set_intersection

std::set_intersection

hpx::ranges::set_symmetric_difference

std::set_symmetric_difference

hpx::ranges::set_union

std::set_union

hpx::ranges::shift_left

P2440

hpx::ranges::shift_right

P2440

hpx::ranges::sort

std::sort

hpx::ranges::stable_partition

std::stable_partition

hpx::ranges::stable_sort

std::stable_sort

hpx::ranges::starts_with

std::starts_with

hpx::ranges::swap_ranges

std::swap_ranges

hpx::ranges::transform

std::transform

hpx::ranges::unique

std::unique

hpx::ranges::unique_copy

std::unique_copy

hpx::ranges::experimental::for_loop

N4808

hpx::ranges::experimental::for_loop_strided

N4808

hpx/any.hpp#

The header hpx/any.hpp corresponds to the C++ standard library header any.

hpx::any is compatible with std::any.

Classes#
Table 125 Classes of header hpx/any.hpp#

Class

C++ standard

hpx::any

std::any

hpx::any_nonser

hpx::bad_any_cast

std::bad_any_cast

hpx::unique_any_nonser

Functions#
Table 126 Functions of header hpx/any.hpp#

Function

C++ standard

hpx::any_cast

std::any_cast

hpx::make_any

std::make_any

hpx::make_any_nonser

hpx::make_unique_any_nonser

hpx/assert.hpp#

The header hpx/assert.hpp corresponds to the C++ standard library header cassert.

HPX_ASSERT is the HPX equivalent to assert in cassert. HPX_ASSERT can also be used in CUDA device code.

Macros#
Table 127 Macros of header hpx/assert.hpp#

Macro

HPX_ASSERT

HPX_ASSERT_MSG

hpx/barrier.hpp#

The header hpx/barrier.hpp corresponds to the C++ standard library header barrier and contains a distributed barrier implementation. This functionality is also exposed through the hpx::distributed namespace. The name in hpx::distributed should be preferred.

Classes#
Table 128 Classes of header hpx/barrier.hpp#

Class

C++ standard

hpx::barrier

std::barrier

Table 129 Distributed implementation of classes of header hpx/barrier.hpp#

Class

hpx::distributed::barrier

hpx/channel.hpp#

The header hpx/channel.hpp contains a local and a distributed channel implementation. This functionality is also exposed through the hpx::distributed namespace. The name in hpx::distributed should be preferred.

Classes#
Table 130 Classes of header hpx/channel.hpp#

Class

hpx::channel

Table 131 Distributed implementation of classes of header hpx/channel.hpp#

Class

hpx::distributed::channel

hpx/chrono.hpp#

The header hpx/chrono.hpp corresponds to the C++ standard library header chrono. The following replacements and extensions are provided compared to chrono.

Classes#
Table 132 Classes of header hpx/chrono.hpp#

Class

C++ standard

hpx::chrono::high_resolution_clock

std::chrono::high_resolution_clock

hpx::chrono::high_resolution_timer

hpx::chrono::steady_time_point

std::chrono::time_point

hpx/condition_variable.hpp#

The header hpx/condition_variable.hpp corresponds to the C++ standard library header condition_variable.

Classes#
Table 133 Classes of header hpx/condition_variable.hpp#

Class

C++ standard

hpx::condition_variable

std::condition_variable

hpx::condition_variable_any

std::condition_variable_any

hpx::cv_status

std::cv_status

hpx/exception.hpp#

The header hpx/exception.hpp corresponds to the C++ standard library header exception. hpx::exception extends std::exception and is the base class for all exceptions thrown in HPX. HPX_THROW_EXCEPTION can be used to throw HPX exceptions with file and line information attached to the exception.
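
The idea of deriving from std::exception and attaching the throw site can be sketched in plain C++. The names `located_error` and `THROW_LOCATED` are hypothetical; they only mimic the spirit of hpx::exception and HPX_THROW_EXCEPTION, whose actual signatures are not reproduced here.

```cpp
#include <cassert>
#include <exception>
#include <string>
#include <utility>

// Hypothetical exception type; like hpx::exception it derives from
// std::exception, but it is not the HPX class.
class located_error : public std::exception
{
public:
    located_error(std::string message, char const* file, int line)
      : message_(std::move(message))
      , file_(file)
      , line_(line)
    {
    }

    char const* what() const noexcept override
    {
        return message_.c_str();
    }

    char const* file() const noexcept { return file_; }
    int line() const noexcept { return line_; }

private:
    std::string message_;
    char const* file_;
    int line_;
};

// Hypothetical macro that captures the throw site.
#define THROW_LOCATED(msg) throw located_error((msg), __FILE__, __LINE__)

// Throws, catches through the std::exception base class, then recovers
// the attached line number.
inline int line_of_failure()
{
    try
    {
        THROW_LOCATED("something went wrong");
    }
    catch (std::exception const& e)
    {
        auto const* located = dynamic_cast<located_error const*>(&e);
        assert(located != nullptr);
        return located->line();
    }
    return -1;    // not reached
}
```

Because the type derives from std::exception, generic handlers that catch `std::exception const&` still see the error, while handlers that know the derived type can read the file and line.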

Macros#
Classes#
Table 134 Classes of header hpx/exception.hpp#

Class

C++ standard

hpx::exception

std::exception

hpx/execution.hpp#

The header hpx/execution.hpp corresponds to the C++ standard library header execution. See High level parallel facilities, Using parallel algorithms and Executor parameters and executor parameter traits for more information about execution policies and executor parameters.

Note

These names are only available in the hpx::execution namespace, not in the top-level hpx namespace.

Constants#
Table 135 Constants of header hpx/execution.hpp#

Constant

C++ standard

hpx::execution::seq

std::execution::seq

hpx::execution::par

std::execution::par

hpx::execution::par_unseq

std::execution::par_unseq

hpx::execution::task

Classes#
Table 136 Classes of header hpx/execution.hpp#

Class

C++ standard

hpx::execution::sequenced_policy

std::execution::sequenced_policy

hpx::execution::parallel_policy

std::execution::parallel_policy

hpx::execution::parallel_unsequenced_policy

std::execution::parallel_unsequenced_policy

hpx::execution::sequenced_task_policy

hpx::execution::parallel_task_policy

hpx::execution::experimental::auto_chunk_size

hpx::execution::experimental::dynamic_chunk_size

hpx::execution::experimental::guided_chunk_size

hpx::execution::experimental::persistent_auto_chunk_size

hpx::execution::experimental::static_chunk_size

hpx::execution::experimental::num_cores

hpx/functional.hpp#

The header hpx/functional.hpp corresponds to the C++ standard library header functional. hpx::function is a more efficient and serializable replacement for std::function.

Constants#

The following constants correspond to the C++ standard std::placeholders.

Table 137 Constants of header hpx/functional.hpp#

Constant

hpx::placeholders::_1

hpx::placeholders::_2

hpx::placeholders::_9

Classes#
Table 138 Classes of header hpx/functional.hpp#

Class

C++ standard

hpx::function

std::function

hpx::function_ref

P0792

hpx::move_only_function

std::move_only_function

hpx::is_bind_expression

std::is_bind_expression

hpx::is_placeholder

std::is_placeholder

hpx::scoped_annotation

Functions#
Table 139 Functions of header hpx/functional.hpp#

Function

C++ standard

hpx::annotated_function

hpx::bind

std::bind

hpx::bind_back

std::bind_back

hpx::bind_front

std::bind_front

hpx::invoke

std::invoke

hpx::invoke_fused

std::apply

hpx::invoke_fused_r

hpx::mem_fn

std::mem_fn
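
The following sketch shows the standard-library counterparts named in the table above, on the assumption that the hpx:: functions listed next to them behave the same way for these simple cases. It uses only the C++17 standard library; `subtract`, `counter`, and `functional_examples` are hypothetical helper names.

```cpp
#include <cassert>
#include <functional>
#include <tuple>

inline int subtract(int a, int b)
{
    return a - b;
}

struct counter
{
    int value;
    int add(int n) const { return value + n; }
};

inline void functional_examples()
{
    using namespace std::placeholders;

    // std::bind: fix the first argument, leave the second open.
    auto ten_minus = std::bind(subtract, 10, _1);
    assert(ten_minus(3) == 7);

    // Placeholders can also reorder arguments.
    auto swapped = std::bind(subtract, _2, _1);
    assert(swapped(3, 10) == 7);

    // std::invoke calls free functions and member functions uniformly.
    counter c{5};
    assert(std::invoke(subtract, 9, 4) == 5);
    assert(std::invoke(&counter::add, c, 2) == 7);

    // std::apply unpacks a tuple into an argument list; the table lists
    // it as the counterpart of hpx::invoke_fused.
    assert(std::apply(subtract, std::make_tuple(8, 6)) == 2);

    // std::mem_fn wraps a pointer to member function.
    auto add = std::mem_fn(&counter::add);
    assert(add(c, 1) == 6);
}
```

Calling `functional_examples()` runs all of the checks; each assertion documents the value the expression is expected to produce.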

hpx/future.hpp#

The header hpx/future.hpp corresponds to the C++ standard library header future. See Extended facilities for futures for more information about extensions to futures compared to the C++ standard library.

This header file also contains overloads of hpx::async, hpx::post, hpx::sync, and hpx::dataflow that can be used with actions. See Action invocation for more information about invoking actions.

Classes#
Table 140 Classes of header hpx/future.hpp#

Class

C++ standard

hpx::future

std::future

hpx::shared_future

std::shared_future

hpx::promise

std::promise

hpx::launch

std::launch

hpx::packaged_task

std::packaged_task

Note

All names except hpx::promise are also available in the top-level hpx namespace. hpx::promise refers to hpx::distributed::promise, a distributed variant of hpx::promise, but will eventually refer to hpx::promise after a deprecation period.

Table 141 Distributed implementation of classes of header hpx/future.hpp#

Class

hpx::distributed::promise

Functions#
Table 142 Functions of header hpx/future.hpp#

Function

C++ standard

hpx::async

std::async

hpx::post

hpx::sync

hpx::dataflow

hpx::make_future

hpx::make_shared_future

hpx::make_ready_future

P0159

hpx::make_ready_future_alloc

hpx::make_ready_future_at

hpx::make_ready_future_after

hpx::make_exceptional_future

P0159

hpx::when_all

P0159

hpx::when_any

P0159

hpx::when_some

hpx::when_each

hpx::wait_all

hpx::wait_any

hpx::wait_some

hpx::wait_each

hpx/init.hpp#

The header hpx/init.hpp contains functionality for starting, stopping, suspending, and resuming the HPX runtime. This is the main way to explicitly start the HPX runtime. See Starting the HPX runtime for more details on starting the HPX runtime.

Classes#
Table 143 Classes of header hpx/init.hpp#

Class

hpx::init_params

hpx::runtime_mode

Functions#
Table 144 Functions of header hpx/init.hpp#

Function

hpx::init

hpx::start

hpx::finalize

hpx::disconnect

hpx::suspend

hpx::resume

hpx/latch.hpp#

The header hpx/latch.hpp corresponds to the C++ standard library header latch. It contains a local and a distributed latch implementation. This functionality is also exposed through the hpx::distributed namespace. The name in hpx::distributed should be preferred.

Classes#
Table 145 Classes of header hpx/latch.hpp#

Class

C++ standard

hpx::latch

std::latch

Table 146 Distributed implementation of classes of header hpx/latch.hpp#

Class

hpx::distributed::latch

hpx/mutex.hpp#

The header hpx/mutex.hpp corresponds to the C++ standard library header mutex.

Classes#
Table 147 Classes of header hpx/mutex.hpp#

Class

C++ standard

hpx::mutex

std::mutex

hpx::no_mutex

hpx::once_flag

std::once_flag

hpx::recursive_mutex

std::recursive_mutex

hpx::spinlock

hpx::timed_mutex

std::timed_mutex

hpx::unlock_guard

Functions#
Table 148 Functions of header hpx/mutex.hpp#

Function

C++ standard

hpx::call_once

std::call_once

hpx/memory.hpp#

The header hpx/memory.hpp corresponds to the C++ standard library header memory. It contains parallel versions of the copy, fill, move, and construct helper functions in memory. See Using parallel algorithms for more information about the parallel algorithms.
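
For readers unfamiliar with the uninitialized-memory helpers, this sketch shows the sequential standard-library form, std::uninitialized_copy followed by std::destroy. It does not use HPX; `copy_into_raw_storage` is a hypothetical helper, and exception safety is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Copy-constructs the strings of `source` into raw, uninitialized storage,
// reads one back, and cleans up. Returns the element at `index`.
inline std::string copy_into_raw_storage(
    std::vector<std::string> const& source, std::size_t index)
{
    assert(index < source.size());

    // Allocate raw memory; no std::string objects exist in it yet.
    std::allocator<std::string> alloc;
    std::string* raw = alloc.allocate(source.size());

    // Construct copies directly in the raw storage.
    std::string* last =
        std::uninitialized_copy(source.begin(), source.end(), raw);

    std::string result = raw[index];

    // Destroy the constructed objects, then release the memory.
    std::destroy(raw, last);
    alloc.deallocate(raw, source.size());
    return result;
}
```

The key point is that the destination holds no live objects before the call, so plain std::copy, which assigns to existing objects, would be incorrect here.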

Functions#
Table 149 hpx functions of header hpx/memory.hpp#

hpx function

C++ standard

hpx::uninitialized_copy

std::uninitialized_copy

hpx::uninitialized_copy_n

std::uninitialized_copy_n

hpx::uninitialized_default_construct

std::uninitialized_default_construct

hpx::uninitialized_default_construct_n

std::uninitialized_default_construct_n

hpx::uninitialized_fill

std::uninitialized_fill

hpx::uninitialized_fill_n

std::uninitialized_fill_n

hpx::uninitialized_move

std::uninitialized_move

hpx::uninitialized_move_n

std::uninitialized_move_n

hpx::uninitialized_value_construct

std::uninitialized_value_construct

hpx::uninitialized_value_construct_n

std::uninitialized_value_construct_n

Table 150 hpx::ranges functions of header hpx/memory.hpp#

hpx::ranges function

C++ standard

hpx::ranges::uninitialized_copy

std::uninitialized_copy

hpx::ranges::uninitialized_copy_n

std::uninitialized_copy_n

hpx::ranges::uninitialized_default_construct

std::uninitialized_default_construct

hpx::ranges::uninitialized_default_construct_n

std::uninitialized_default_construct_n

hpx::ranges::uninitialized_fill

std::uninitialized_fill

hpx::ranges::uninitialized_fill_n

std::uninitialized_fill_n

hpx::ranges::uninitialized_move

std::uninitialized_move

hpx::ranges::uninitialized_move_n

std::uninitialized_move_n

hpx::ranges::uninitialized_value_construct

std::uninitialized_value_construct

hpx::ranges::uninitialized_value_construct_n

std::uninitialized_value_construct_n

hpx/numeric.hpp#

The header hpx/numeric.hpp corresponds to the C++ standard library header numeric. See Using parallel algorithms for more information about the parallel algorithms.
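
The difference between the inclusive and exclusive scans, and the shape of a transform_reduce call, are easiest to see on a small input. The sketch uses the sequential C++17 standard-library versions; `running_totals`, `offsets`, and `sum_of_squares` are hypothetical helper names.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

inline std::vector<int> running_totals(std::vector<int> const& in)
{
    std::vector<int> out(in.size());
    // out[i] = in[0] + ... + in[i]
    std::inclusive_scan(in.begin(), in.end(), out.begin());
    assert(out.size() == in.size());
    return out;
}

inline std::vector<int> offsets(std::vector<int> const& in)
{
    std::vector<int> out(in.size());
    // out[i] = 0 + in[0] + ... + in[i - 1]
    std::exclusive_scan(in.begin(), in.end(), out.begin(), 0);
    return out;
}

inline int sum_of_squares(std::vector<int> const& in)
{
    // Transform each element, then reduce the results.
    return std::transform_reduce(in.begin(), in.end(), 0, std::plus<>{},
        [](int x) { return x * x; });
}
```

For the input 1, 2, 3, 4 the inclusive scan yields 1, 3, 6, 10 and the exclusive scan yields 0, 1, 3, 6. The exclusive form is the one typically used to turn a list of sizes into starting offsets.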

Functions#
Table 151 hpx functions of header hpx/numeric.hpp#

hpx function

C++ standard

hpx::adjacent_difference

std::adjacent_difference

hpx::exclusive_scan

std::exclusive_scan

hpx::inclusive_scan

std::inclusive_scan

hpx::reduce

std::reduce

hpx::transform_exclusive_scan

std::transform_exclusive_scan

hpx::transform_inclusive_scan

std::transform_inclusive_scan

hpx::transform_reduce

std::transform_reduce

Table 152 hpx::ranges functions of header hpx/numeric.hpp#

hpx::ranges function

hpx::ranges::adjacent_difference

hpx::ranges::exclusive_scan

hpx::ranges::inclusive_scan

hpx::ranges::reduce

hpx::ranges::transform_exclusive_scan

hpx::ranges::transform_inclusive_scan

hpx::ranges::transform_reduce

hpx/optional.hpp#

The header hpx/optional.hpp corresponds to the C++ standard library header optional. hpx::optional is compatible with std::optional.

Constants#
  • hpx::nullopt

Classes#
Table 153 Classes of header hpx/optional.hpp#

Class

C++ standard

hpx::optional

std::optional

hpx::nullopt_t

std::nullopt_t

hpx::bad_optional_access

std::bad_optional_access

hpx/runtime.hpp#

The header hpx/runtime.hpp contains functions for accessing local and distributed runtime information.

Typedefs#
Table 154 Typedefs of header hpx/runtime.hpp#

Typedef

hpx::startup_function_type

hpx::shutdown_function_type

Functions#
Table 155 Functions of header hpx/runtime.hpp#

Function

hpx::find_root_locality

hpx::find_all_localities

hpx::find_remote_localities

hpx::find_locality

hpx::get_colocation_id

hpx::get_locality_id

hpx::get_num_worker_threads

hpx::get_worker_thread_num

hpx::get_thread_name

hpx::register_pre_startup_function

hpx::register_startup_function

hpx::register_pre_shutdown_function

hpx::register_shutdown_function

hpx::get_num_localities

hpx::get_locality_name

hpx/experimental/scope.hpp#

The header hpx/experimental/scope.hpp corresponds to the C++ standard library header experimental/scope.
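
The concept behind scope_exit is a guard object whose destructor runs a callable when the enclosing scope is left. The sketch below is a deliberately minimal, hypothetical `simple_scope_exit`; it is neither the HPX class nor the one from the experimental standard header, and it omits move support and exception handling.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical minimal scope guard: runs a callable when the enclosing
// scope is left, unless release() was called first.
template <typename F>
class simple_scope_exit
{
public:
    explicit simple_scope_exit(F f)
      : f_(std::move(f))
    {
    }

    simple_scope_exit(simple_scope_exit const&) = delete;
    simple_scope_exit& operator=(simple_scope_exit const&) = delete;

    ~simple_scope_exit()
    {
        if (active_)
            f_();
    }

    // Cancel the pending call.
    void release() noexcept { active_ = false; }

private:
    F f_;
    bool active_ = true;
};

// Records the order of events so the guard's timing is observable.
inline std::vector<int> scope_exit_order()
{
    std::vector<int> events;
    {
        simple_scope_exit guard([&events] { events.push_back(2); });
        events.push_back(1);
    }    // guard's destructor runs here
    events.push_back(3);
    assert(events.size() == 3);
    return events;
}
```

scope_fail and scope_success follow the same pattern but run the callable only when the scope is left by an exception or only when it is left normally, respectively.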

Classes#
Table 156 Classes of header hpx/experimental/scope.hpp#

Class

C++ standard

hpx::experimental::scope_exit

std::experimental::scope_exit

hpx::experimental::scope_fail

std::experimental::scope_fail

hpx::experimental::scope_success

std::experimental::scope_success

hpx/semaphore.hpp#

The header hpx/semaphore.hpp corresponds to the C++ standard library header semaphore.

Classes#
Table 157 Classes of header hpx/semaphore.hpp#

Class

C++ standard

hpx::binary_semaphore

std::binary_semaphore

hpx::counting_semaphore

std::counting_semaphore

hpx/shared_mutex.hpp#

The header hpx/shared_mutex.hpp corresponds to the C++ standard library header shared_mutex.

Classes#
Table 158 Classes of header hpx/shared_mutex.hpp#

Class

C++ standard

hpx::shared_mutex

std::shared_mutex

hpx/source_location.hpp#

The header hpx/source_location.hpp corresponds to the C++ standard library header source_location.

Classes#
Table 159 Classes of header hpx/source_location.hpp#

Class

C++ standard

hpx::source_location

std::source_location

hpx/stop_token.hpp#

The header hpx/stop_token.hpp corresponds to the C++ standard library header stop_token.

Constants#
Table 160 Constants of header hpx/stop_token.hpp#

Constant

C++ standard

hpx::nostopstate

std::nostopstate

Classes#
Table 161 Classes of header hpx/stop_token.hpp#

Class

C++ standard

hpx::stop_callback

std::stop_callback

hpx::stop_source

std::stop_source

hpx::stop_token

std::stop_token

hpx::nostopstate_t

std::nostopstate_t

hpx/system_error.hpp#

The header hpx/system_error.hpp corresponds to the C++ standard library header system_error.

Classes#
Table 162 Classes of header hpx/system_error.hpp#

Class

C++ standard

hpx::error_code

std::error_code

hpx/task_block.hpp#

The header hpx/task_block.hpp corresponds to the task_block feature in N4755. See using_task_block for more details on using task blocks.

Classes#
Table 163 Classes of header hpx/task_block.hpp#

Class

hpx::experimental::task_canceled_exception

hpx::experimental::task_block

Functions#
Table 164 Functions of header hpx/task_block.hpp#

Function

hpx::experimental::define_task_block

hpx::experimental::define_task_block_restore_thread

hpx/experimental/task_group.hpp#

The header hpx/experimental/task_group.hpp corresponds to the task_group feature in oneAPI Threading Building Blocks (oneTBB).

Classes#
Table 165 Classes of header hpx/experimental/task_group.hpp#

Class

hpx::experimental::task_group

hpx/thread.hpp#

The header hpx/thread.hpp corresponds to the C++ standard library header thread. The functionality in this header is equivalent to the standard library thread functionality, with the exception that the HPX equivalents are implemented on top of lightweight threads and the HPX runtime.

Classes#
Table 166 Classes of header hpx/thread.hpp#

Class

C++ standard

hpx::thread

std::thread

hpx::jthread

std::jthread

Functions#
Table 167 Functions of header hpx/thread.hpp#

Function

C++ standard

hpx::this_thread::yield

std::this_thread::yield

hpx::this_thread::get_id

std::this_thread::get_id

hpx::this_thread::sleep_for

std::this_thread::sleep_for

hpx::this_thread::sleep_until

std::this_thread::sleep_until

hpx/tuple.hpp#

The header hpx/tuple.hpp corresponds to the C++ standard library header tuple. hpx::tuple can be used in CUDA device code, unlike std::tuple.

Constants#
Table 168 Constants of header hpx/tuple.hpp#

Constant

C++ standard

hpx::ignore

std::ignore

Classes#
Table 169 Classes of header hpx/tuple.hpp#

Class

C++ standard

hpx::tuple

std::tuple

hpx::tuple_size

std::tuple_size

hpx::tuple_element

std::tuple_element

Functions#
Table 170 Functions of header hpx/tuple.hpp#

Function

C++ standard

hpx::make_tuple

std::make_tuple

hpx::tie

std::tie

hpx::forward_as_tuple

std::forward_as_tuple

hpx::tuple_cat

std::tuple_cat

hpx::get

std::get

hpx/type_traits.hpp#

The header hpx/type_traits.hpp corresponds to the C++ standard library header type_traits.
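
The two traits answer slightly different questions, which the following sketch shows with their C++17 standard-library counterparts. `parse_digit` and `call_or_default` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

inline int parse_digit(char c)
{
    assert(c >= '0' && c <= '9');
    return c - '0';
}

// is_invocable asks only "can it be called with these arguments?".
static_assert(std::is_invocable_v<decltype(&parse_digit), char>);
static_assert(!std::is_invocable_v<decltype(&parse_digit), std::string>);

// is_invocable_r additionally asks whether the result converts to R.
static_assert(std::is_invocable_r_v<long, decltype(&parse_digit), char>);
static_assert(
    !std::is_invocable_r_v<std::string, decltype(&parse_digit), char>);

// Typical use: choose a code path at compile time.
template <typename F>
int call_or_default(F&& f, char c, int fallback)
{
    if constexpr (std::is_invocable_r_v<int, F, char>)
        return f(c);
    else
        return fallback;
}
```

`call_or_default(parse_digit, '7', -1)` calls the function, whereas passing a callable that cannot accept a `char` selects the fallback branch without a compile error.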

Classes#
Table 171 Classes of header hpx/type_traits.hpp#

Class

C++ standard

hpx::is_invocable

std::is_invocable

hpx::is_invocable_r

std::is_invocable_r

hpx/unwrap.hpp#

The header hpx/unwrap.hpp contains utilities for unwrapping futures.

Classes#
Table 172 Classes of header hpx/unwrap.hpp#

Class

hpx::functional::unwrap

hpx::functional::unwrap_n

hpx::functional::unwrap_all

Functions#
Table 173 Functions of header hpx/unwrap.hpp#

Function

hpx::unwrap

hpx::unwrap_n

hpx::unwrap_all

hpx::unwrapping

hpx::unwrapping_n

hpx::unwrapping_all

hpx/version.hpp#

The header hpx/version.hpp provides version information about HPX.

Macros#
Table 174 Macros of header hpx/version.hpp#

Macro

HPX_VERSION_MAJOR

HPX_VERSION_MINOR

HPX_VERSION_SUBMINOR

HPX_VERSION_FULL

HPX_VERSION_DATE

HPX_VERSION_TAG

HPX_AGAS_VERSION

Functions#
Table 175 Functions of header hpx/version.hpp#

Function

hpx::major_version

hpx::minor_version

hpx::subminor_version

hpx::full_version

hpx::full_version_as_string

hpx::tag

hpx::agas_version

hpx::build_type

hpx::build_date_time

hpx/wrap_main.hpp#

The header hpx/wrap_main.hpp does not provide any direct functionality but is used for implicitly using main as the runtime entry point. See Re-use the main() function as the main HPX entry point for more details on implicitly starting the HPX runtime.

Public distributed API#

Our Public Distributed API offers a rich set of tools and functions that enable developers to harness the full potential of distributed computing. Here, you’ll find a comprehensive list of header files, classes and functions for various distributed computing features provided by HPX.

hpx/barrier.hpp#

The header hpx/barrier.hpp includes a distributed barrier implementation. For information regarding the C++ standard library header barrier, see Public API.

Classes#
Table 176 Distributed implementation of classes of header hpx/barrier.hpp#

Class

hpx::distributed::barrier

Functions#
Table 177 hpx functions of header hpx/barrier.hpp#

Function

hpx::distributed::wait

hpx::distributed::synchronize

hpx/collectives.hpp#

The header hpx/collectives.hpp contains definitions and implementations related to collective operations.

Classes#
Table 178 hpx classes of header hpx/collectives.hpp#

Class

hpx::collectives::num_sites_arg

hpx::collectives::this_site_arg

hpx::collectives::that_site_arg

hpx::collectives::generation_arg

hpx::collectives::root_site_arg

hpx::collectives::tag_arg

hpx::collectives::arity_arg

hpx::collectives::communicator

hpx::collectives::channel_communicator

Functions#
Table 179 hpx functions of header hpx/collectives.hpp#

Function

hpx::collectives::all_gather

hpx::collectives::all_reduce

hpx::collectives::all_to_all

hpx::collectives::broadcast_to

hpx::collectives::broadcast_from

hpx::collectives::create_channel_communicator

hpx::collectives::set

hpx::collectives::get

hpx::collectives::create_communication_set

hpx::collectives::create_communicator

hpx::collectives::create_local_communicator

hpx::collectives::communicator::set_info

hpx::collectives::communicator::get_info

hpx::collectives::communicator::is_root

hpx::collectives::exclusive_scan

hpx::collectives::gather_here

hpx::collectives::gather_there

hpx::collectives::inclusive_scan

hpx::collectives::reduce_here

hpx::collectives::reduce_there

hpx::collectives::scatter_from

hpx::collectives::scatter_to

hpx/latch.hpp#

The header hpx/latch.hpp includes a distributed latch implementation. For information regarding the C++ standard library header latch, see Public API.

Classes#
Table 180 Distributed implementation of classes of header hpx/latch.hpp#

Class

hpx::distributed::latch

Member functions#
Table 181 hpx functions of class hpx::distributed::latch from header hpx/latch.hpp#

Function

hpx::distributed::latch::count_down_and_wait

hpx::distributed::latch::arrive_and_wait

hpx::distributed::latch::count_down

hpx::distributed::latch::is_ready

hpx::distributed::latch::try_wait

hpx::distributed::latch::wait

hpx/async.hpp#

The header hpx/async.hpp includes distributed implementations of hpx::async, hpx::post, hpx::sync, and hpx::dataflow. For information regarding the C++ standard library header, see Public API.

Functions#
Table 182 Distributed implementation of functions of header hpx/async.hpp#

Functions

hpx::async (distributed)

hpx::sync (distributed)

hpx::post (distributed)

hpx::dataflow (distributed)

hpx/components.hpp#

The header hpx/include/components.hpp includes the components implementation. A component in HPX is a C++ class that can be created remotely and whose member functions can be invoked remotely as well. More information about how components can be defined, created, and used can be found in Writing components. Components and actions includes examples on the accumulator, template accumulator and template function accumulator.

Macros#
Table 183 hpx macros of header hpx/components.hpp#

Macro

HPX_DEFINE_COMPONENT_ACTION

HPX_REGISTER_ACTION_DECLARATION

HPX_REGISTER_ACTION

HPX_REGISTER_COMMANDLINE_MODULE

HPX_REGISTER_COMPONENT

HPX_REGISTER_COMPONENT_MODULE

HPX_REGISTER_STARTUP_MODULE

Classes#
Table 184 hpx classes of header hpx/components.hpp#

Class

hpx::components::client

hpx::components::client_base

hpx::components::component

hpx::components::component_base

hpx::components::component_commandline_base

Functions#
Table 185 hpx functions of header hpx/components.hpp#

Function

hpx::new_

Full API#

The full API of HPX is presented below. The listings for the public API above refer to the full documentation below.

Note

Most names listed in the full API reference are implementation details or considered unstable. They are listed mostly for completeness. If there is a particular feature you think deserves being in the public API we may consider promoting it. In general we prioritize making sure features corresponding to C++ standard library features are stable and complete.

algorithms#

See Public API for a list of names and headers that are part of the public HPX API.

hpx::experimental::run_on_all#

Defined in header hpx/task_block.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental

Top-level namespace.

Functions

template<typename ExPolicy, typename T, typename ...Ts>
requires (hpx::is_execution_policy_v<ExPolicy>)
decltype(auto) run_on_all(ExPolicy &&policy, T &&t, Ts&&... ts)

Run a function on all available worker threads with reduction support using the given execution policy

Template Parameters
  • ExPolicy – The execution policy type

  • T – The first type in a list of reduction types and the function type to invoke (last argument)

  • Ts – The list of reduction types and the function type to invoke (last argument)

Parameters
  • policy – The execution policy to use

  • t – The first in a list of reductions and the function to invoke (last argument)

  • ts – The list of reductions and the function to invoke (last argument)

template<typename T, typename ...Ts>
requires (!hpx::is_execution_policy_v<T>)
decltype(auto) run_on_all(T &&t, Ts&&... ts)

Run a function on all available worker threads with reduction support using the hpx::execution::par execution policy

Template Parameters
  • T – The first type in a list of reduction types and the function type to invoke (last argument)

  • Ts – The list of reduction types and the function type to invoke (last argument)

Parameters
  • t – The first in a list of reductions and the function to invoke (last argument)

  • ts – The list of reductions and the function to invoke (last argument)

hpx::experimental::task_canceled_exception, hpx::experimental::task_block, hpx::experimental::define_task_block, hpx::experimental::define_task_block_restore_thread#

Defined in header hpx/task_block.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental

Top-level namespace.

Functions

template<typename ExPolicy, typename F>
requires (hpx::is_execution_policy_v<std::decay_t<ExPolicy>>)
decltype(auto) define_task_block(ExPolicy &&policy, F &&f)

Constructs a task_block, tr, using the given execution policy policy, and invokes the expression f(tr) on the user-provided object, f.

Postcondition: All tasks spawned from f have finished execution. A call to define_task_block may return on a different thread than that on which it was called.

Note

It is expected (but not mandated) that f will (directly or indirectly) call tr.run(callable_object).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the task block may be parallelized.

  • F – The type of the user defined function to invoke inside the define_task_block (deduced). F shall be MoveConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • f – The user defined function to invoke inside the task block. Given an lvalue tr of type task_block, the expression, (void)f(tr), shall be well-formed.

Throws

exception_list – specified in Exception Handling.

template<typename ExPolicy = hpx::execution::parallel_policy>
class task_block#
#include <task_block.hpp>

The class task_block defines an interface for forking and joining parallel tasks. The define_task_block and define_task_block_restore_thread function templates create an object of type task_block and pass a reference to that object to a user-provided callable object.

An object of class task_block cannot be constructed, destroyed, copied, or moved except by the implementation of the task region library. Taking the address of a task_block object via operator& or addressof is ill-formed. The result of obtaining its address by any other means is unspecified.

A task_block is active if it was created by the nearest enclosing task block, where “task block” refers to an invocation of define_task_block or define_task_block_restore_thread and “nearest enclosing” means the most recent invocation that has not yet completed. Code designated for execution in another thread by means other than the facilities in this section (e.g., using thread or async) is not enclosed in the task region, and a task_block passed to (or captured by) such code is not active within that code. Performing any operation on a task_block that is not active results in undefined behavior.

The task_block that is active before a specific call to the run member function is not active within the asynchronous function invoked by run. (The invoked function should not, therefore, capture the task_block from the surrounding block.)

Example:
   define_task_block([&](auto& tr) {
       tr.run([&] {
           tr.run([] { f(); });              // Error: tr is not active
           define_task_block([&](auto& tr) { // Nested task block
               tr.run(f);                    // OK: inner tr is active
               /// ...
           });
       }); /// ...
   });
Template Parameters

ExPolicy – The execution policy an instance of a task_block was created with. This defaults to parallel_policy.

Public Types

using execution_policy = ExPolicy#

Refers to the type of the execution policy used to create the task_block.

Public Functions

inline constexpr execution_policy const &get_execution_policy() const noexcept#

Returns the execution policy instance used to create this task_block.

template<typename F, typename ...Ts>
inline void run(F &&f, Ts&&... ts)#

Causes the expression f() to be invoked asynchronously. The invocation of f is permitted to run on an unspecified thread in an unordered fashion relative to the sequence of operations following the call to run(f) (the continuation), or indeterminately sequenced within the same thread as the continuation.

The call to run synchronizes with the invocation of f. The completion of f() synchronizes with the next invocation of wait on the same task_block or completion of the nearest enclosing task block (i.e., the define_task_block or define_task_block_restore_thread that created this task block).

Requires: F shall be MoveConstructible. The expression, (void)f(), shall be well-formed.

Precondition: this shall be the active task_block.

Postconditions: A call to run may return on a different thread than that on which it was called.

Note

The call to run is sequenced before the continuation as if run returns on the same thread. The invocation of the user-supplied callable object f may be immediate or may be delayed until compute resources are available. run might or might not return before invocation of f completes.

Throws

task_canceled_exception – described in Exception Handling.

template<typename Executor, typename F, typename ...Ts>
inline void run(Executor &&exec, F &&f, Ts&&... ts)#

Causes the expression f() to be invoked asynchronously using the given executor. The invocation of f is permitted to run on an unspecified thread associated with the given executor and in an unordered fashion relative to the sequence of operations following the call to run(exec, f) (the continuation), or indeterminately sequenced within the same thread as the continuation.

The call to run synchronizes with the invocation of f. The completion of f() synchronizes with the next invocation of wait on the same task_block or completion of the nearest enclosing task block (i.e., the define_task_block or define_task_block_restore_thread that created this task block).

Requires: Executor shall be a type modeling the Executor concept. F shall be MoveConstructible. The expression, (void)f(), shall be well-formed.

Precondition: this shall be the active task_block.

Postconditions: A call to run may return on a different thread than that on which it was called.

Note

The call to run is sequenced before the continuation as if run returns on the same thread. The invocation of the user-supplied callable object f may be immediate or may be delayed until compute resources are available. run might or might not return before invocation of f completes.

Throws

task_canceled_exception – described in Exception Handling. The function will also throw an exception_list holding all exceptions that were caught while executing the tasks.

inline void wait()#

Blocks until the tasks spawned using this task_block have finished.

Precondition: this shall be the active task_block.

Postcondition: All tasks spawned by the nearest enclosing task region have finished. A call to wait may return on a different thread than that on which it was called.

Example:
   define_task_block([&](auto& tr) {
       tr.run([&]{ process(a, w, x); }); // Process a[w] through a[x]
       if (y < x) tr.wait();   // Wait if overlap between [w, x) and [y, z)
       process(a, y, z);       // Process a[y] through a[z]
   });

Note

The call to wait is sequenced before the continuation as if wait returns on the same thread.

Throws

task_canceled_exception – described in Exception Handling. The function will also throw an exception_list holding all exceptions that were caught while executing the tasks.

inline ExPolicy &policy() noexcept#

Returns a reference to the execution policy used to construct this object.

Precondition: this shall be the active task_block.

inline constexpr ExPolicy const &policy() const noexcept#

Returns a reference to the execution policy used to construct this object.

Precondition: this shall be the active task_block.

Private Members

hpx::experimental::task_group tasks_#
threads::thread_id_type id_#
ExPolicy policy_#
class task_canceled_exception : public exception#
#include <task_block.hpp>

The class task_canceled_exception defines the type of objects thrown by task_block::run or task_block::wait if they detect that an exception is pending within the current parallel region.

Public Functions

inline task_canceled_exception() noexcept#
namespace parallel

Typedefs

typedef hpx::experimental::task_canceled_exception task_canceled_exception#

Deprecated: use hpx::experimental::task_canceled_exception instead.

Functions

template<typename ExPolicy, typename F>
requires (hpx::is_async_execution_policy_v<std::decay_t<ExPolicy>>)
hpx::future<void> define_task_block(ExPolicy &&policy, F &&f)#

template<typename F>
void define_task_block(F &&f)#

Deprecated since HPX 1.9: hpx::parallel::v2::define_task_block is deprecated, use hpx::experimental::define_task_block instead. The same deprecation applies to the remaining hpx::parallel overloads taking ExPolicy and F, including the one constrained on !hpx::is_async_execution_policy_v<std::decay_t<ExPolicy>>.

hpx::experimental::task_group#

Defined in header hpx/experimental/task_group.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace execution#
namespace experimental#

Typedefs

using task_group = hpx::experimental::task_group#

Deprecated: use hpx::experimental::task_group instead.
namespace experimental

Top-level namespace.

class task_group#
#include <task_group.hpp>

A task_group represents concurrent execution of a group of tasks. Tasks can be dynamically added to the group while it is executing.

Public Functions

task_group()#
~task_group()#
task_group(task_group const&) = delete#
task_group(task_group&&) = delete#
task_group &operator=(task_group const&) = delete#
task_group &operator=(task_group&&) = delete#
template<typename Executor, typename F, typename ...Ts>
requires (hpx::traits::is_executor_any_v<std::decay_t<Executor>>)
void run(Executor &&exec, F &&f, Ts&&... ts)#

Adds a task to compute f() and returns immediately.

Template Parameters
  • Executor – The type of the executor used to run the task.

  • F – The type of the user defined function to invoke.

  • Ts – The type of additional arguments used to invoke f().

Parameters
  • exec – The executor to use for the execution of the task.

  • f – The user defined function to invoke inside the task group.

  • ts – Additional arguments to use to invoke f().

template<typename F, typename ...Ts>
requires (!hpx::traits::is_executor_any_v<std::decay_t<F>>)
void run(F &&f, Ts&&... ts)#

Adds a task to compute f() and returns immediately.

Template Parameters
  • F – The type of the user defined function to invoke.

  • Ts – The type of additional arguments used to invoke f().

Parameters
  • f – The user defined function to invoke inside the task group.

  • ts – Additional arguments to use to invoke f().

void wait()#

Waits for all tasks in the group to complete or be cancelled.

void add_exception(std::exception_ptr p)#

Adds an exception to this task_group.

Private Types

using shared_state_type = lcos::detail::future_data<void>#

Private Functions

void serialize(serialization::output_archive&, unsigned const)#

Private Members

hpx::lcos::local::latch latch_#
hpx::intrusive_ptr<shared_state_type> state_#
hpx::exception_list errors_#
std::atomic<bool> has_arrived_#

Private Static Functions

static inline constexpr void serialize(serialization::input_archive&, unsigned const) noexcept#

Friends

friend class serialization::access
hpx::adjacent_difference#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx#

Functions

template<typename FwdIter1, typename FwdIter2>
FwdIter2 adjacent_difference(FwdIter1 first, FwdIter1 last, FwdIter2 dest)#

Assigns each element of the range beginning at dest the difference between its corresponding element in the range [first, last) and the one preceding it, except *dest, which is assigned *first.

Note

Complexity: Exactly (last - first) - 1 applications of the binary operator and (last - first) assignments.

Template Parameters
  • FwdIter1 – The type of the source iterators used for the input range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator used for the output range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • dest – Refers to the beginning of the sequence of elements the results will be assigned to.

Returns

The adjacent_difference algorithm returns a FwdIter2. The adjacent_difference algorithm returns an iterator to the element past the last element written.
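
As a worked illustration of what ends up in the destination range, here is a minimal sketch that uses std::adjacent_difference from <numeric> as a stand-in. The policy-free hpx::adjacent_difference overload above takes the same (first, last, dest) arguments, so it is assumed (not verified here) to produce the same values. The helper name differences is hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Stand-in: std::adjacent_difference instead of hpx::adjacent_difference.
// out[0] = in[0]; out[i] = in[i] - in[i - 1] for i > 0.
std::vector<int> differences(std::vector<int> const& in)
{
    std::vector<int> out(in.size());
    auto const end =
        std::adjacent_difference(in.begin(), in.end(), out.begin());
    // The returned iterator points past the last element written.
    assert(end == out.end());
    static_cast<void>(end);
    return out;
}
```

For the input 2, 4, 7, 11 the destination holds 2, 2, 3, 4: the first element is copied and each later element is the difference to its predecessor.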

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> adjacent_difference(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest)#

Assigns each element of the range beginning at dest the difference between its corresponding element in the range [first, last) and the one preceding it, except *dest, which is assigned *first. Executed according to the policy.

The difference operations in the parallel adjacent_difference invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The difference operations in the parallel adjacent_difference invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Exactly (last - first) - 1 applications of the binary operator and (last - first) assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the input range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator used for the output range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • dest – Refers to the beginning of the sequence of elements the results will be assigned to.

Returns

The adjacent_difference algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The adjacent_difference algorithm returns an iterator to the element past the last element written.
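
The return-type rule stated above is a compile-time selection on the policy type. The sketch below illustrates that idea only: sync_policy, task_policy, is_task_policy and result_t are hypothetical names, std::future stands in for hpx::future, and none of this is the definition of HPX's algorithm_result_t.

```cpp
#include <cassert>
#include <future>
#include <type_traits>

// Hypothetical tag types standing in for non-task and task policies.
struct sync_policy {};
struct task_policy {};

// Hypothetical trait: true only for the task-style policy.
template <typename Policy>
struct is_task_policy : std::false_type {};
template <>
struct is_task_policy<task_policy> : std::true_type {};

// Selects std::future<T> for task policies and plain T otherwise.
// The policy type is decayed first because it is deduced from ExPolicy&&.
template <typename Policy, typename T>
using result_t = typename std::conditional<
    is_task_policy<typename std::decay<Policy>::type>::value,
    std::future<T>, T>::type;

static_assert(std::is_same<result_t<sync_policy, int*>, int*>::value,
    "a non-task policy yields the plain result type");
static_assert(
    std::is_same<result_t<task_policy&&, int*>, std::future<int*>>::value,
    "a task policy yields a future of the result type");
```

The practical consequence for callers is that the result of a task-policy invocation has to be retrieved from the future (for example with get()), whereas the other policies hand back the iterator directly.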

template<typename FwdIter1, typename FwdIter2, typename Op>
FwdIter2 adjacent_difference(FwdIter1 first, FwdIter1 last, FwdIter2 dest, Op &&op)#

Assigns each element of the range beginning at dest the result of applying op to its corresponding element in the range [first, last) and the one preceding it, except *dest, which is assigned *first.

Note

Complexity: Exactly (last - first) - 1 applications of the binary operator and (last - first) assignments.

Template Parameters
  • FwdIter1 – The type of the source iterators used for the input range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator used for the output range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Op – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of adjacent_difference requires Op to meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • dest – Refers to the beginning of the sequence of elements the results will be assigned to.

  • op – The binary operator which returns the difference of elements. The signature should be equivalent to the following:

    Ret op(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that an object of type FwdIter1 can be dereferenced and then implicitly converted to Type1. The type Ret must be such that its values can be assigned to the elements of the range beginning at dest.

Returns

The adjacent_difference algorithm returns FwdIter2. The adjacent_difference algorithm returns an iterator to the element past the last element written.
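
A custom operation replaces the subtraction but keeps the same pairing of each element with its predecessor. The following minimal sketch again uses std::adjacent_difference as a stand-in; the standard algorithm calls the operation as op(current, previous), and it is an assumption here that the HPX overload follows the same argument order. The helper name ratios is hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Stand-in: std::adjacent_difference with a user-supplied operation.
// out[0] = in[0]; out[i] = in[i] / in[i - 1] for i > 0.
// Precondition: no element that serves as a predecessor is zero.
std::vector<int> ratios(std::vector<int> const& in)
{
    std::vector<int> out(in.size());
    std::adjacent_difference(in.begin(), in.end(), out.begin(),
        [](int current, int previous) { return current / previous; });
    return out;
}
```

For the input 1, 2, 6, 24 the destination holds 1, 2, 3, 4. The operation returns a value of the element type rather than a bool, since its result is what gets stored in the destination range.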

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Op>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> adjacent_difference(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, Op &&op)#

Assigns each element of the range beginning at dest the result of applying op to its corresponding element in the range [first, last) and the one preceding it, except *dest, which is assigned *first. Executed according to the policy.

The difference operations in the parallel adjacent_difference invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The difference operations in the parallel adjacent_difference invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Exactly (last - first) - 1 applications of the binary operator and (last - first) assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the input range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator used for the output range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Op – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of adjacent_difference requires Op to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • dest – Refers to the beginning of the sequence of elements the results will be assigned to.

  • op – The binary operator which returns the difference of elements. The signature should be equivalent to the following:

    Ret op(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that an object of type FwdIter1 can be dereferenced and then implicitly converted to Type1. The type Ret must be such that its values can be assigned to the elements of the range beginning at dest.

Returns

The adjacent_difference algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The adjacent_difference algorithm returns an iterator to the element past the last element written.

hpx::adjacent_find#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter, typename Pred = hpx::parallel::detail::equal_to>
InIter adjacent_find(InIter first, InIter last, Pred &&pred = Pred())#

Searches the range [first, last) for two consecutive identical elements.

Note

Complexity: Exactly the smaller of (result - first) + 1 and (last - first) - 1 applications of the predicate, where result is the value returned.

Template Parameters
  • InIter – The type of the source iterators used for the range (deduced). This iterator type must meet the requirements of an input iterator.

  • Pred – The type of an optional function/function object to use.

Parameters
  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • pred – The binary predicate which returns true if the elements should be treated as equal. The signature should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of type InIter can be dereferenced and then implicitly converted to Type1.

Returns

The adjacent_find algorithm returns an iterator to the first of the identical elements. If no such elements are found, last is returned.
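
The return convention (an iterator to the first element of the matching pair, or last when nothing matches) is easiest to see on small inputs. This minimal sketch uses std::adjacent_find as a stand-in for the hpx:: overload, on the assumption that both share the semantics documented above; first_adjacent is a hypothetical helper that converts the iterator into an index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>

// Returns the index of the first element of the first adjacent pair (a, b)
// for which pred(a, b) is true, or -1 if there is no such pair.
template <typename Pred = std::equal_to<>>
std::ptrdiff_t first_adjacent(std::vector<int> const& v, Pred pred = Pred())
{
    auto const it = std::adjacent_find(v.begin(), v.end(), pred);
    return it == v.end() ? -1 : std::distance(v.begin(), it);
}
```

With the default predicate, 1, 2, 2, 3 yields index 1 and 1, 2, 3 yields -1. Passing std::greater<>() turns the search into "first descent", so 1, 3, 2, 4 yields index 1 (the pair 3, 2).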

template<typename ExPolicy, typename FwdIter, typename Pred = hpx::parallel::detail::equal_to>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> adjacent_find(ExPolicy &&policy, FwdIter first, FwdIter last, Pred &&pred = Pred())#

Searches the range [first, last) for two consecutive identical elements. This version uses the given binary predicate pred.

The comparison operations in the parallel adjacent_find invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel adjacent_find invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

This overload of adjacent_find is used when the caller supplies their own binary predicate pred.

Note

Complexity: Exactly the smaller of (result - first) + 1 and (last - first) - 1 applications of the predicate, where result is the value returned.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used for the range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of adjacent_find requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • pred – The binary predicate which returns true if the elements should be treated as equal. The signature should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type1.

Returns

The adjacent_find algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The adjacent_find algorithm returns an iterator to the first of the identical elements. If no such elements are found, last is returned.

hpx::all_of, hpx::any_of, hpx::none_of#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter, typename F>
util::detail::algorithm_result_t<ExPolicy, bool> none_of(ExPolicy &&policy, FwdIter first, FwdIter last, F &&f)#

Checks if unary predicate f returns true for no elements in the range [first, last).

Applications of function objects in a parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Applications of function objects in a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of the predicate f

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of none_of requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The none_of algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The none_of algorithm returns true if the unary predicate f returns true for no elements in the range, false otherwise. It returns true if the range is empty.

template<typename InIter, typename F>
bool none_of(InIter first, InIter last, F &&f)#

Checks if unary predicate f returns true for no elements in the range [first, last).

Note

Complexity: At most last - first applications of the predicate f

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of none_of requires F to meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type InIter can be dereferenced and then implicitly converted to Type.

Returns

The none_of algorithm returns a bool. It returns true if the unary predicate f returns true for no elements in the range, false otherwise. It returns true if the range is empty.

template<typename ExPolicy, typename FwdIter, typename F>
util::detail::algorithm_result_t<ExPolicy, bool> any_of(ExPolicy &&policy, FwdIter first, FwdIter last, F &&f)#

Checks if unary predicate f returns true for at least one element in the range [first, last).

Applications of function objects in a parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Applications of function objects in a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of the predicate f

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of any_of requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The any_of algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The any_of algorithm returns true if the unary predicate f returns true for at least one element in the range, false otherwise. It returns false if the range is empty.

template<typename InIter, typename F>
bool any_of(InIter first, InIter last, F &&f)#

Checks if unary predicate f returns true for at least one element in the range [first, last).

Note

Complexity: At most last - first applications of the predicate f

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of any_of requires F to meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type InIter can be dereferenced and then implicitly converted to Type.

Returns

The any_of algorithm returns a bool. It returns true if the unary predicate f returns true for at least one element in the range, false otherwise. It returns false if the range is empty.
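
The three predicates of this section differ mainly in how they treat mixed and empty ranges. The minimal sketch below evaluates all three on the same input, using std::all_of, std::any_of and std::none_of as stand-ins on the assumption that the hpx:: overloads return the same values, as their descriptions state. range_checks and check_even are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Results of the three quantifier algorithms for one range.
struct range_checks
{
    bool all;
    bool any;
    bool none;
};

// Applies the predicate "is even" to v with all three algorithms.
range_checks check_even(std::vector<int> const& v)
{
    auto const is_even = [](int x) { return x % 2 == 0; };
    return {std::all_of(v.begin(), v.end(), is_even),
        std::any_of(v.begin(), v.end(), is_even),
        std::none_of(v.begin(), v.end(), is_even)};
}
```

For an empty range the result is all = true, any = false, none = true, which matches the empty-range statements in the Returns paragraphs. For 1, 2, 3 only any is true.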

template<typename ExPolicy, typename FwdIter, typename F>
util::detail::algorithm_result_t<ExPolicy, bool> all_of(ExPolicy &&policy, FwdIter first, FwdIter last, F &&f)#

Checks if unary predicate f returns true for all elements in the range [first, last).

Applications of function objects in a parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Applications of function objects in a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of the predicate f

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of all_of requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The all_of algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The all_of algorithm returns true if the unary predicate f returns true for all elements in the range, false otherwise. It returns true if the range is empty.

template<typename InIter, typename F>
bool all_of(InIter first, InIter last, F &&f)#

Checks if unary predicate f returns true for all elements in the range [first, last).

Note

Complexity: At most last - first applications of the predicate f

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of all_of requires F to meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type InIter can be dereferenced and then implicitly converted to Type.

Returns

The all_of algorithm returns a bool. It returns true if the unary predicate f returns true for all elements in the range, and false otherwise. It returns true if the range is empty.

hpx::copy, hpx::copy_n, hpx::copy_if#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> copy(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest)#

Copies the elements in the range, defined by [first, last), to another range beginning at dest. Executed according to the policy.

The assignments in the parallel copy algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel copy algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The copy algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The copy algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename FwdIter1, typename FwdIter2>
FwdIter2 copy(FwdIter1 first, FwdIter1 last, FwdIter2 dest)#

Copies the elements in the range, defined by [first, last), to another range beginning at dest.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The copy algorithm returns a FwdIter2. The copy algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename ExPolicy, typename FwdIter1, typename Size, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> copy_n(ExPolicy &&policy, FwdIter1 first, Size count, FwdIter2 dest)#

Copies the elements in the range [first, first + count), starting from first and proceeding to first + count - 1, to another range beginning at dest. Executed according to the policy.

The assignments in the parallel copy_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel copy_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly count assignments, if count > 0, no assignments otherwise.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to copy.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The copy_n algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The copy_n algorithm returns an iterator in the destination range, pointing past the last element copied if count > 0, or dest otherwise.

template<typename FwdIter1, typename Size, typename FwdIter2>
FwdIter2 copy_n(FwdIter1 first, Size count, FwdIter2 dest)#

Copies the elements in the range [first, first + count), starting from first and proceeding to first + count - 1, to another range beginning at dest.

Note

Complexity: Performs exactly count assignments, if count > 0, no assignments otherwise.

Template Parameters
  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to copy.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The copy_n algorithm returns a FwdIter2. The copy_n algorithm returns an iterator in the destination range, pointing past the last element copied if count > 0, or dest otherwise.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Pred>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> copy_if(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, Pred &&pred)#

Copies the elements in the range, defined by [first, last), to another range beginning at dest. Copies only the elements for which the predicate pred returns true. The relative order of the copied elements is preserved. Executed according to the policy.

The assignments in the parallel copy_if algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel copy_if algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs not more than last - first assignments, exactly last - first applications of the predicate pred.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of copy_if requires Pred to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the required elements. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter1 can be dereferenced and then implicitly converted to Type.

Returns

The copy_if algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The copy_if algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename FwdIter1, typename FwdIter2, typename Pred>
FwdIter2 copy_if(FwdIter1 first, FwdIter1 last, FwdIter2 dest, Pred &&pred)#

Copies the elements in the range, defined by [first, last), to another range beginning at dest. Copies only the elements for which the predicate pred returns true. The relative order of the copied elements is preserved.

Note

Complexity: Performs not more than last - first assignments, exactly last - first applications of the predicate pred.

Template Parameters
  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of copy_if requires Pred to meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the required elements. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter1 can be dereferenced and then implicitly converted to Type.

Returns

The copy_if algorithm returns a FwdIter2. The copy_if algorithm returns the output iterator to the element in the destination range, one past the last element copied.

hpx::count, hpx::count_if#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter, typename T>
util::detail::algorithm_result<ExPolicy, typename std::iterator_traits<FwdIter>::difference_type>::type count(ExPolicy &&policy, FwdIter first, FwdIter last, T const &value)#

Returns the number of elements in the range [first, last) satisfying specific criteria. This version counts the elements that are equal to the given value. Executed according to the policy.

The comparisons in the parallel count algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Note

Complexity: Performs exactly last - first comparisons.

Note

The comparisons in the parallel count algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the comparisons.

  • FwdIter – The type of the source iterator used (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to search for (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • value – The value to search for.

Returns

The count algorithm returns a hpx::future<difference_type> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns difference_type otherwise (where difference_type is defined by std::iterator_traits<FwdIter>::difference_type). The count algorithm returns the number of elements satisfying the given criteria.

template<typename InIter, typename T>
std::iterator_traits<InIter>::difference_type count(InIter first, InIter last, T const &value)#

Returns the number of elements in the range [first, last) satisfying specific criteria. This version counts the elements that are equal to the given value.

Note

Complexity: Performs exactly last - first comparisons.

Template Parameters
  • InIter – The type of the source iterator used (deduced). This iterator type must meet the requirements of an input iterator.

  • T – The type of the value to search for (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • value – The value to search for.

Returns

The count algorithm returns a difference_type (where difference_type is defined by std::iterator_traits<InIter>::difference_type). The count algorithm returns the number of elements satisfying the given criteria.

template<typename ExPolicy, typename FwdIter, typename F>
util::detail::algorithm_result<ExPolicy, typename std::iterator_traits<FwdIter>::difference_type>::type count_if(ExPolicy &&policy, FwdIter first, FwdIter last, F &&f)#

Returns the number of elements in the range [first, last) satisfying specific criteria. This version counts elements for which predicate f returns true. Executed according to the policy.

Note

Complexity: Performs exactly last - first applications of the predicate.

Note

The predicate applications in the parallel count_if algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Note

The predicate applications in the parallel count_if algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the comparisons.

  • FwdIter – The type of the source begin iterator used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of count_if requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the required elements. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The count_if algorithm returns hpx::future<difference_type> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns difference_type otherwise (where difference_type is defined by std::iterator_traits<FwdIter>::difference_type). The count_if algorithm returns the number of elements satisfying the given criteria.

template<typename InIter, typename F>
std::iterator_traits<InIter>::difference_type count_if(InIter first, InIter last, F &&f)#

Returns the number of elements in the range [first, last) satisfying specific criteria. This version counts elements for which predicate f returns true.

Note

Complexity: Performs exactly last - first applications of the predicate.

Template Parameters
  • InIter – The type of the source begin iterator used (deduced). This iterator type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of count_if requires F to meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the required elements. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type InIter can be dereferenced and then implicitly converted to Type.

Returns

The count_if algorithm returns difference_type (where difference_type is defined by std::iterator_traits<InIter>::difference_type). The count_if algorithm returns the number of elements satisfying the given criteria.

hpx::destroy, hpx::destroy_n#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter>
util::detail::algorithm_result_t<ExPolicy> destroy(ExPolicy &&policy, FwdIter first, FwdIter last)#

Destroys objects of type typename iterator_traits<FwdIter>::value_type in the range [first, last). Executed according to the policy.

The operations in the parallel destroy algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The operations in the parallel destroy algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first operations.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the destroy operations.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The destroy algorithm returns a hpx::future<void>, if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns void otherwise.

template<typename FwdIter>
void destroy(FwdIter first, FwdIter last)#

Destroys objects of type typename iterator_traits<FwdIter>::value_type in the range [first, last).

Note

Complexity: Performs exactly last - first operations.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The destroy algorithm returns void.

template<typename ExPolicy, typename FwdIter, typename Size>
util::detail::algorithm_result_t<ExPolicy, FwdIter> destroy_n(ExPolicy &&policy, FwdIter first, Size count)#

Destroys objects of type typename iterator_traits<FwdIter>::value_type in the range [first, first + count). Executed according to the policy.

The operations in the parallel destroy_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The operations in the parallel destroy_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly count operations, if count > 0, no operations otherwise.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the destroy operations.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to apply this algorithm to.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

Returns

The destroy_n algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The destroy_n algorithm returns the iterator to the element in the source range, one past the last element destroyed.

template<typename FwdIter, typename Size>
FwdIter destroy_n(FwdIter first, Size count)#

Destroys objects of type typename iterator_traits<FwdIter>::value_type in the range [first, first + count).

Note

Complexity: Performs exactly count operations, if count > 0, no operations otherwise.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to apply this algorithm to.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

Returns

The destroy_n algorithm returns a FwdIter. The destroy_n algorithm returns the iterator to the element in the source range, one past the last element destroyed.

hpx::ends_with#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter1, typename InIter2, typename Pred>
bool ends_with(InIter1 first1, InIter1 last1, InIter2 first2, InIter2 last2, Pred &&pred)#

Checks whether the second range defined by [first2, last2) matches the suffix of the first range defined by [first1, last1).

The comparisons in the ends_with algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: Linear: at most min(N1, N2) applications of the predicate, where N1 = last1 - first1 and N2 = last2 - first2.

Template Parameters
  • InIter1 – The type of the begin source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • InIter2 – The type of the begin destination iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • Pred – The binary predicate that compares the projected elements.

Parameters
  • first1 – Refers to the beginning of the source range.

  • last1 – Refers to the end of the source range.

  • first2 – Refers to the beginning of the destination range.

  • last2 – Refers to the end of the destination range.

  • pred – Specifies the binary predicate function (or function object) which will be invoked for comparison of the elements in the two ranges.

Returns

The ends_with algorithm returns bool. The ends_with algorithm returns a boolean with the value true if the second range matches the suffix of the first range, false otherwise.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Pred>
hpx::parallel::util::detail::algorithm_result<ExPolicy, bool>::type ends_with(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, Pred &&pred)#

Checks whether the second range defined by [first2, last2) matches the suffix of the first range defined by [first1, last1). Executed according to the policy.

The comparisons in the parallel ends_with algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparisons in the parallel ends_with algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Linear: at most min(N1, N2) applications of the predicate, where N1 = last1 - first1 and N2 = last2 - first2.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the begin source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the begin destination iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The binary predicate that compares the projected elements.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the source range.

  • last1 – Refers to the end of the source range.

  • first2 – Refers to the beginning of the destination range.

  • last2 – Refers to the end of the destination range.

  • pred – Specifies the binary predicate function (or function object) which will be invoked for comparison of the elements in the two ranges.

Returns

The ends_with algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The ends_with algorithm returns a boolean with the value true if the second range matches the suffix of the first range, false otherwise.

hpx::equal#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Pred = detail::equal_to>
util::detail::algorithm_result_t<ExPolicy, bool> equal(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, Pred &&op = Pred())#

Returns true if the range [first1, last1) is equal to the range [first2, last2), and false otherwise. Executed according to the policy.

The comparison operations in the parallel equal algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel equal algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: O(min(last1 - first1, last2 - first2)) applications of the predicate op.

Note

The two ranges are considered equal if, for every iterator i in the range [first1,last1), *i equals *(first2 + (i - first1)). This overload of equal uses the predicate op to determine if two elements are equal.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of equal requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The equal algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The equal algorithm returns true if the elements in the two ranges are equal, otherwise it returns false. If the length of the range [first1, last1) does not equal the length of the range [first2, last2), it returns false.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
util::detail::algorithm_result_t<ExPolicy, bool> equal(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2)#

Returns true if the range [first1, last1) is equal to the range [first2, last2), and false otherwise. Executed according to policy.

The comparison operations in the parallel equal algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel equal algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: O(min(last1 - first1, last2 - first2)) applications of the predicate std::equal_to.

Note

The two ranges are considered equal if, for every iterator i in the range [first1,last1), *i equals *(first2 + (i - first1)). This overload of equal uses operator== to determine if two elements are equal.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

Returns

The equal algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The equal algorithm returns true if the elements in the two ranges are equal, otherwise it returns false. If the length of the range [first1, last1) does not equal the length of the range [first2, last2), it returns false.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Pred = detail::equal_to>
util::detail::algorithm_result_t<ExPolicy, bool> equal(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, Pred &&op = Pred())#

Returns true if the range [first1, last1) is equal to the range starting at first2, and false otherwise. Executed according to policy.

The comparison operations in the parallel equal algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel equal algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last1 - first1 applications of the predicate op.

Note

The two ranges are considered equal if, for every iterator i in the range [first1,last1), *i equals *(first2 + (i - first1)). This overload of equal uses the predicate op to determine if two elements are equal.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of equal requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The equal algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The equal algorithm returns true if the elements in the two ranges are equal, otherwise it returns false.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
util::detail::algorithm_result_t<ExPolicy, bool> equal(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2)#

Returns true if the range [first1, last1) is equal to the range [first2, first2 + (last1 - first1)), and false otherwise. Executed according to policy.

The comparison operations in the parallel equal algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel equal algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last1 - first1 applications of the predicate std::equal_to.

Note

The two ranges are considered equal if, for every iterator i in the range [first1,last1), *i equals *(first2 + (i - first1)). This overload of equal uses operator== to determine if two elements are equal.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

Returns

The equal algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The equal algorithm returns true if the elements in the two ranges are equal, otherwise it returns false.

template<typename FwdIter1, typename FwdIter2, typename Pred = detail::equal_to>
bool equal(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, Pred &&op = Pred())#

Returns true if the range [first1, last1) is equal to the range [first2, last2), and false otherwise.

Note

Complexity: At most min(last1 - first1, last2 - first2) applications of the predicate op.

Note

The two ranges are considered equal if, for every iterator i in the range [first1,last1), *i equals *(first2 + (i - first1)). This overload of equal uses the predicate op to determine if two elements are equal.

Template Parameters
  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of equal requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>.

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The equal algorithm returns a bool. The equal algorithm returns true if the elements in the two ranges are equal, otherwise it returns false. If the length of the range [first1, last1) does not equal the length of the range [first2, last2), it returns false.
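As an illustration of the predicate parameter described above, here is a minimal sketch of a case-insensitive comparison. It uses std::equal from the C++ standard library as a stand-in, on the assumption that the hpx::equal overload without an execution policy behaves the same way once hpx/algorithm.hpp is included. The names same_letter, equal_ignoring_case and check_equal_ignoring_case are hypothetical and exist only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Binary predicate in the documented shape: bool pred(const Type1&, const Type2&).
// Two characters count as equal when they match ignoring ASCII case.
bool same_letter(char a, char b)
{
    return std::tolower(static_cast<unsigned char>(a)) ==
        std::tolower(static_cast<unsigned char>(b));
}

// Hypothetical helper: compares two strings with the four-iterator form.
// std::equal is used here as a stand-in for the hpx::equal overload.
bool equal_ignoring_case(const std::string& lhs, const std::string& rhs)
{
    return std::equal(
        lhs.begin(), lhs.end(), rhs.begin(), rhs.end(), same_letter);
}

// Self-check documenting the expected results; call it from main().
void check_equal_ignoring_case()
{
    assert(equal_ignoring_case("HPX", "hpx"));     // same letters, other case
    assert(!equal_ignoring_case("HPX", "hpx!"));   // lengths differ
    assert(!equal_ignoring_case("HPX", "hpc"));    // one letter differs
}
```

The predicate takes its arguments by value, which is allowed because the signature does not need to have const & as long as the arguments are not modified.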

template<typename FwdIter1, typename FwdIter2>
bool equal(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2)#

Returns true if the range [first1, last1) is equal to the range [first2, last2), and false otherwise.

Note

Complexity: At most min(last1 - first1, last2 - first2) applications of the predicate std::equal_to.

Note

The two ranges are considered equal if, for every iterator i in the range [first1,last1), *i equals *(first2 + (i - first1)). This overload of equal uses operator== to determine if two elements are equal.

Template Parameters
  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

Returns

The equal algorithm returns a bool. The equal algorithm returns true if the elements in the two ranges are equal, otherwise it returns false. If the length of the range [first1, last1) does not equal the length of the range [first2, last2), it returns false.
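The length rule stated above can be sketched as follows. The example uses std::equal as a stand-in for the four-iterator overload without a predicate, assuming the two agree; same_contents and check_same_contents are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper wrapping the four-iterator form without a predicate.
// Elements are compared with operator==.
bool same_contents(const std::vector<int>& a, const std::vector<int>& b)
{
    return std::equal(a.begin(), a.end(), b.begin(), b.end());
}

// Self-check documenting the expected results; call it from main().
void check_same_contents()
{
    assert(same_contents({1, 2, 3}, {1, 2, 3}));
    assert(!same_contents({1, 2, 3}, {1, 2, 3, 4}));   // lengths differ
    assert(!same_contents({1, 2, 3}, {1, 2, 4}));      // last element differs
    assert(same_contents({}, {}));                     // two empty ranges
}
```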

template<typename FwdIter1, typename FwdIter2, typename Pred = detail::equal_to>
bool equal(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, Pred &&op = Pred())#

Returns true if the range [first1, last1) is equal to the range [first2, first2 + (last1 - first1)), and false otherwise.

Note

Complexity: At most last1 - first1 applications of the predicate op.

Note

The two ranges are considered equal if, for every iterator i in the range [first1,last1), *i equals *(first2 + (i - first1)). This overload of equal uses the predicate op to determine if two elements are equal.

Template Parameters
  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of equal requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>.

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The equal algorithm returns a bool. The equal algorithm returns true if the elements in the two ranges are equal, otherwise it returns false.
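Because the three-iterator overloads take no last2, they inspect exactly last1 - first1 elements of the second range and cannot detect that it is longer. A minimal sketch of that behavior, using std::equal as a stand-in and assuming the hpx::equal overload matches it; starts_with_values is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: true if `prefix` matches the first prefix.size()
// elements of `data`.
// Precondition (checked by assert): data is at least as long as prefix,
// because the three-iterator form reads prefix.size() elements starting
// at data.begin() without knowing where data ends.
bool starts_with_values(
    const std::vector<int>& data, const std::vector<int>& prefix)
{
    assert(data.size() >= prefix.size());
    return std::equal(prefix.begin(), prefix.end(), data.begin());
}
```

Calling starts_with_values({1, 2, 3, 4}, {1, 2}) yields true even though the two vectors differ in length, which is why the four-iterator overloads are the safer choice when the length of the second range is not known to be sufficient.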

hpx::exclusive_scan#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter, typename OutIter, typename T>
OutIter exclusive_scan(InIter first, InIter last, OutIter dest, T init)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(+, init, *first, …, *(first + (i - dest) - 1)).

The reduce operations in the parallel exclusive_scan algorithm invoked without an execution policy object will execute in sequential order in the calling thread.

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum.

Note

Complexity: O(last - first) applications of the binary operation std::plus<T>.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(+, a1, …, aN) is defined as:

  • a1 when N is 1

  • GENERALIZED_NONCOMMUTATIVE_SUM(+, a1, …, aK) + GENERALIZED_NONCOMMUTATIVE_SUM(+, aM, …, aN) where 1 < K+1 = M <= N.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • init – The initial value for the generalized sum.

Returns

The exclusive_scan algorithm returns OutIter. The exclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.
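The difference between the two scans is easiest to see side by side. The following is a minimal sketch using std::exclusive_scan and std::inclusive_scan from <numeric> (C++17) as stand-ins, assuming the hpx:: overloads without an execution policy produce the same values. The helper names are hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: out[i] = init + in[0] + ... + in[i-1].
std::vector<int> exclusive_sums(const std::vector<int>& in, int init)
{
    std::vector<int> out(in.size());
    std::exclusive_scan(in.begin(), in.end(), out.begin(), init);
    return out;
}

// Hypothetical helper: out[i] = in[0] + ... + in[i].
std::vector<int> inclusive_sums(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    std::inclusive_scan(in.begin(), in.end(), out.begin());
    return out;
}

// Self-check documenting the expected results; call it from main().
void check_scans()
{
    const std::vector<int> in{1, 2, 3, 4};
    // The ith exclusive sum leaves out in[i]; the first result is init itself.
    assert((exclusive_sums(in, 10) == std::vector<int>{10, 11, 13, 16}));
    // The ith inclusive sum contains in[i].
    assert((inclusive_sums(in) == std::vector<int>{1, 3, 6, 10}));
}
```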

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename T>
util::detail::algorithm_result_t<ExPolicy, FwdIter2> exclusive_scan(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, T init)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(+, init, *first, …, *(first + (i - dest) - 1)).

The reduce operations in the parallel exclusive_scan algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel exclusive_scan algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum.

Note

Complexity: O(last - first) applications of the binary operation std::plus<T>.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(+, a1, …, aN) is defined as:

  • a1 when N is 1

  • GENERALIZED_NONCOMMUTATIVE_SUM(+, a1, …, aK) + GENERALIZED_NONCOMMUTATIVE_SUM(+, aM, …, aN) where 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • init – The initial value for the generalized sum.

Returns

The exclusive_scan algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The exclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename InIter, typename OutIter, typename T, typename Op>
OutIter exclusive_scan(InIter first, InIter last, OutIter dest, T init, Op &&op)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(op, init, *first, …, *(first + (i - dest) - 1)).

The reduce operations in the parallel exclusive_scan algorithm invoked without an execution policy object will execute in sequential order in the calling thread.

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum. If op is not mathematically associative, the behavior of exclusive_scan may be non-deterministic.

Note

Complexity: O(last - first) applications of the binary operation op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

  • Op – The type of the binary function object used for the reduction operation.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • init – The initial value for the generalized sum.

  • op – Specifies the function (or function object) which will be invoked for each of the values of the input sequence. This is a binary operation. The signature of this function should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Ret must be such that an object of a type as given by the input sequence can be implicitly converted to any of those types.

Returns

The exclusive_scan algorithm returns OutIter. The exclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.
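To make the role of op and init concrete, the following sketch folds the input from left to right, which is one valid grouping of GENERALIZED_NONCOMMUTATIVE_SUM. It is a hand-written illustration and not the HPX implementation. exclusive_scan_sketch, exclusive_scan_std and check_scan_sketch are hypothetical names, and std::exclusive_scan is used as a reference on the assumption that the hpx:: overload without an execution policy yields the same values.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Sequential sketch of the definition: out[i] = op(init, in[0], ..., in[i-1]),
// grouped strictly left to right.
template <typename T, typename Op>
std::vector<T> exclusive_scan_sketch(const std::vector<T>& in, T init, Op op)
{
    std::vector<T> out;
    out.reserve(in.size());
    T running = init;
    for (const T& x : in)
    {
        out.push_back(running);      // value before x is folded in
        running = op(running, x);
    }
    return out;
}

// Reference result from the standard library, used for comparison only.
template <typename T, typename Op>
std::vector<T> exclusive_scan_std(const std::vector<T>& in, T init, Op op)
{
    std::vector<T> out(in.size());
    std::exclusive_scan(in.begin(), in.end(), out.begin(), init, op);
    return out;
}

// Self-check documenting the expected results; call it from main().
void check_scan_sketch()
{
    const std::vector<int> in{2, 3, 4};
    const std::vector<int> expected{1, 2, 6};   // running products, init = 1
    assert(exclusive_scan_sketch(in, 1, std::multiplies<int>{}) == expected);
    assert(exclusive_scan_std(in, 1, std::multiplies<int>{}) == expected);
}
```

Multiplication is associative, so every grouping gives the same values. With a non-associative op such as subtraction, a parallel execution is free to group the operands differently from the left-to-right loop above, which is the source of the non-determinism mentioned earlier.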

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Op, typename T>
util::detail::algorithm_result_t<ExPolicy, FwdIter2> exclusive_scan(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, T init, Op &&op)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(op, init, *first, …, *(first + (i - dest) - 1)).

The reduce operations in the parallel exclusive_scan algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel exclusive_scan algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum. If op is not mathematically associative, the behavior of exclusive_scan may be non-deterministic.

Note

Complexity: O(last - first) applications of the binary operation op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Op – The type of the binary function object used for the reduction operation.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • init – The initial value for the generalized sum.

  • op – Specifies the function (or function object) which will be invoked for each of the values of the input sequence. This is a binary operation. The signature of this function should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Ret must be such that an object of a type as given by the input sequence can be implicitly converted to any of those types.

Returns

The exclusive_scan algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The exclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.

hpx::fill, hpx::fill_n#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter, typename T>
util::detail::algorithm_result_t<ExPolicy> fill(ExPolicy &&policy, FwdIter first, FwdIter last, T value)#

Assigns the given value to the elements in the range [first, last). Executed according to the policy.

The assignments in the parallel fill algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel fill algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be assigned (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The fill algorithm returns a hpx::future<void> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns void otherwise.

template<typename FwdIter, typename T>
void fill(FwdIter first, FwdIter last, T value)#

Assigns the given value to the elements in the range [first, last).

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be assigned (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The fill algorithm returns void.
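A minimal sketch of the overload without an execution policy, using std::fill as a stand-in and assuming hpx::fill assigns the same way. reset_all and check_reset_all are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: overwrite every element of v with value.
// Performs exactly v.size() assignments.
void reset_all(std::vector<int>& v, int value)
{
    std::fill(v.begin(), v.end(), value);
}

// Self-check documenting the expected result; call it from main().
void check_reset_all()
{
    std::vector<int> v{1, 2, 3};
    reset_all(v, 7);
    assert((v == std::vector<int>{7, 7, 7}));
}
```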

template<typename ExPolicy, typename FwdIter, typename Size, typename T>
util::detail::algorithm_result_t<ExPolicy, FwdIter> fill_n(ExPolicy &&policy, FwdIter first, Size count, T value)#

Assigns the given value to the first count elements in the range beginning at first if count > 0. Does nothing otherwise. Executed according to the policy.

The assignments in the parallel fill_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel fill_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly count assignments, for count > 0.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to assign value to.

  • T – The type of the value to be assigned (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The fill_n algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise.

template<typename FwdIter, typename Size, typename T>
FwdIter fill_n(FwdIter first, Size count, T value)#

Assigns the given value to the first count elements in the range beginning at first if count > 0. Does nothing otherwise.

Note

Complexity: Performs exactly count assignments, for count > 0.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to assign value to.

  • T – The type of the value to be assigned (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The fill_n algorithm returns a FwdIter.
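A minimal sketch of the non-policy overload described above. The helper name fill_prefix and the alias alg are hypothetical. The alias resolves to hpx when hpx/algorithm.hpp is found and otherwise falls back to std::fill_n as a stand-in, on the assumption that both behave alike here. The sketch also assumes, following std::fill_n, that the returned iterator points one past the last element assigned.

```cpp
#include <cassert>
#include <vector>
#if __has_include(<hpx/algorithm.hpp>)
#include <hpx/algorithm.hpp>
namespace alg = hpx;    // HPX found: alg::fill_n is hpx::fill_n
#else
#include <algorithm>
namespace alg = std;    // stand-in so the sketch builds without HPX
#endif

// Hypothetical helper: overwrite the first `count` elements of `v` with
// `value` and report how far the returned iterator is from v.begin().
inline long fill_prefix(std::vector<int>& v, int count, int value)
{
    // Precondition: the destination must hold at least `count` elements.
    assert(count >= 0 && count <= static_cast<int>(v.size()));
    auto it = alg::fill_n(v.begin(), count, value);
    return static_cast<long>(it - v.begin());
}
```

With count equal to 0 nothing is assigned and, under the assumption above, the returned iterator equals first.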

hpx::find, hpx::find_if, hpx::find_if_not, hpx::find_end, hpx::find_first_of#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter, typename T>
util::detail::algorithm_result_t<ExPolicy, FwdIter> find(ExPolicy &&policy, FwdIter first, FwdIter last, T const &val)#

Returns the first element in the range [first, last) that is equal to val. Executed according to the policy.

The comparison operations in the parallel find algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel find algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of the operator==().

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the comparisons.

  • FwdIter – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to find (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • val – The value to compare the elements to.

Returns

The find algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The find algorithm returns the first element in the range [first,last) that is equal to val. If no such element in the range of [first,last) is equal to val, then the algorithm returns last.

template<typename InIter, typename T>
InIter find(InIter first, InIter last, T const &val)#

Returns the first element in the range [first, last) that is equal to val.

Note

Complexity: At most last - first applications of the operator==().

Template Parameters
  • InIter – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of an input iterator.

  • T – The type of the value to find (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • val – The value to compare the elements to.

Returns

The find algorithm returns an InIter. The find algorithm returns the first element in the range [first,last) that is equal to val. If no such element in the range of [first,last) is equal to val, then the algorithm returns last.
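The return convention above (an iterator to the match, or last when nothing matches) can be sketched as follows. The helper index_of and the alias alg are hypothetical. The alias resolves to hpx when hpx/algorithm.hpp is found and otherwise to std, on the assumption that std::find is an equivalent stand-in for the non-policy overload.

```cpp
#include <cassert>
#include <vector>
#if __has_include(<hpx/algorithm.hpp>)
#include <hpx/algorithm.hpp>
namespace alg = hpx;    // HPX found: alg::find is hpx::find
#else
#include <algorithm>
namespace alg = std;    // stand-in so the sketch builds without HPX
#endif

// Hypothetical helper: position of the first element equal to `val`,
// or -1 when find returns `last` (no match).
inline long index_of(std::vector<int> const& v, int val)
{
    auto it = alg::find(v.begin(), v.end(), val);
    // Either nothing matched, or the iterator refers to a matching element.
    assert(it == v.end() || *it == val);
    return it == v.end() ? -1 : static_cast<long>(it - v.begin());
}
```

Comparing the result against last before dereferencing is the important step; an empty range always yields last.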

template<typename ExPolicy, typename FwdIter, typename F>
util::detail::algorithm_result_t<ExPolicy, FwdIter> find_if(ExPolicy &&policy, FwdIter first, FwdIter last, F &&f)#

Returns the first element in the range [first, last) for which predicate f returns true. Executed according to the policy.

The comparison operations in the parallel find_if algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel find_if algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of the predicate.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies the predicate.

  • FwdIter – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of find_if requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • f – The unary predicate which returns true for the required element. The signature of the predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The find_if algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The find_if algorithm returns the first element in the range [first,last) that satisfies the predicate f. If no such element exists that satisfies the predicate f, the algorithm returns last.

template<typename InIter, typename F>
InIter find_if(InIter first, InIter last, F &&f)#

Returns the first element in the range [first, last) for which predicate f returns true.

Note

Complexity: At most last - first applications of the predicate.

Template Parameters
  • InIter – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of find_if requires F to meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • f – The unary predicate which returns true for the required element. The signature of the predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type InIter can be dereferenced and then implicitly converted to Type.

Returns

The find_if algorithm returns an InIter. The find_if algorithm returns the first element in the range [first,last) that satisfies the predicate f. If no such element exists that satisfies the predicate f, the algorithm returns last.
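A short sketch of a unary predicate matching the signature described above. The helper first_negative and the alias alg are hypothetical. The alias resolves to hpx when hpx/algorithm.hpp is found and otherwise to std, on the assumption that std::find_if is an equivalent stand-in for the non-policy overload.

```cpp
#include <cassert>
#include <vector>
#if __has_include(<hpx/algorithm.hpp>)
#include <hpx/algorithm.hpp>
namespace alg = hpx;    // HPX found: alg::find_if is hpx::find_if
#else
#include <algorithm>
namespace alg = std;    // stand-in so the sketch builds without HPX
#endif

// Hypothetical helper: position of the first negative element,
// or -1 when find_if returns `last` (no element satisfies the predicate).
inline long first_negative(std::vector<int> const& v)
{
    // The predicate takes its argument by value and does not modify it.
    auto is_negative = [](int x) { return x < 0; };
    auto it = alg::find_if(v.begin(), v.end(), is_negative);
    assert(it == v.end() || *it < 0);
    return it == v.end() ? -1 : static_cast<long>(it - v.begin());
}
```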

template<typename ExPolicy, typename FwdIter, typename F>
util::detail::algorithm_result_t<ExPolicy, FwdIter> find_if_not(ExPolicy &&policy, FwdIter first, FwdIter last, F &&f)#

Returns the first element in the range [first, last) for which predicate f returns false. Executed according to the policy.

The comparison operations in the parallel find_if_not algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel find_if_not algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of the predicate.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies the predicate.

  • FwdIter – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of find_if_not requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • f – The unary predicate which returns false for the required element. The signature of the predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The find_if_not algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The find_if_not algorithm returns the first element in the range [first, last) that does not satisfy the predicate f. If no such element exists that does not satisfy the predicate f, the algorithm returns last.

template<typename FwdIter, typename F>
FwdIter find_if_not(FwdIter first, FwdIter last, F &&f)#

Returns the first element in the range [first, last) for which predicate f returns false.

Note

Complexity: At most last - first applications of the predicate.

Template Parameters
  • FwdIter – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of find_if_not requires F to meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • f – The unary predicate which returns false for the required element. The signature of the predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The find_if_not algorithm returns a FwdIter. The find_if_not algorithm returns the first element in the range [first, last) that does not satisfy the predicate f. If no such element exists that does not satisfy the predicate f, the algorithm returns last.
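To illustrate the inverted predicate sense, the sketch below skips leading blanks in a string. The helper first_non_blank and the alias alg are hypothetical. The alias resolves to hpx when hpx/algorithm.hpp is found and otherwise to std, on the assumption that std::find_if_not is an equivalent stand-in for the non-policy overload.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#if __has_include(<hpx/algorithm.hpp>)
#include <hpx/algorithm.hpp>
namespace alg = hpx;    // HPX found: alg::find_if_not is hpx::find_if_not
#else
#include <algorithm>
namespace alg = std;    // stand-in so the sketch builds without HPX
#endif

// Hypothetical helper: index of the first character that is not a blank,
// or s.size() when every character is a blank (find_if_not returned `last`).
inline std::size_t first_non_blank(std::string const& s)
{
    auto is_blank = [](char c) { return c == ' ' || c == '\t'; };
    auto it = alg::find_if_not(s.begin(), s.end(), is_blank);
    // The result is either `last` or an element for which the predicate is false.
    assert(it == s.end() || !is_blank(*it));
    return static_cast<std::size_t>(it - s.begin());
}
```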

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Pred = detail::equal_to>
util::detail::algorithm_result_t<ExPolicy, FwdIter1> find_end(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, Pred &&op = Pred())#

Returns the last subsequence of elements [first2, last2) found in the range [first1, last1) using the given predicate op to compare elements. Executed according to the policy.

The comparison operations in the parallel find_end algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel find_end algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

This overload of find_end is available if the user decides to provide the algorithm their own predicate op.

Note

Complexity: at most S*(N-S+1) comparisons where S = distance(first2, last2) and N = distance(first1, last1).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the comparisons.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of find_end requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements the algorithm will be searching for.

  • last2 – Refers to the end of the sequence of elements the algorithm will be searching for.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The find_end algorithm returns a hpx::future<FwdIter1> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter1 otherwise. The find_end algorithm returns an iterator to the beginning of the last subsequence [first2, last2) in range [first1, last1). If the length of the subsequence [first2, last2) is greater than the length of the range [first1, last1), last1 is returned. Additionally, if the subsequence is empty or no subsequence is found, last1 is also returned.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
util::detail::algorithm_result_t<ExPolicy, FwdIter1> find_end(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2)#

Returns the last subsequence of elements [first2, last2) found in the range [first1, last1). Elements are compared using operator==. Executed according to the policy.

The comparison operations in the parallel find_end algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel find_end algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: at most S*(N-S+1) comparisons where S = distance(first2, last2) and N = distance(first1, last1).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the comparisons.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements the algorithm will be searching for.

  • last2 – Refers to the end of the sequence of elements the algorithm will be searching for.

Returns

The find_end algorithm returns a hpx::future<FwdIter1> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter1 otherwise. The find_end algorithm returns an iterator to the beginning of the last subsequence [first2, last2) in range [first1, last1). If the length of the subsequence [first2, last2) is greater than the length of the range [first1, last1), last1 is returned. Additionally, if the subsequence is empty or no subsequence is found, last1 is also returned.

template<typename FwdIter1, typename FwdIter2, typename Pred = detail::equal_to>
FwdIter1 find_end(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, Pred &&op = Pred())#

Returns the last subsequence of elements [first2, last2) found in the range [first1, last1) using the given predicate op to compare elements.

This overload of find_end is available if the user decides to provide the algorithm their own predicate op.

Note

Complexity: at most S*(N-S+1) comparisons where S = distance(first2, last2) and N = distance(first1, last1).

Template Parameters
  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of find_end requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>.

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements the algorithm will be searching for.

  • last2 – Refers to the end of the sequence of elements the algorithm will be searching for.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The find_end algorithm returns a FwdIter1. The find_end algorithm returns an iterator to the beginning of the last subsequence [first2, last2) in range [first1, last1). If the length of the subsequence [first2, last2) is greater than the length of the range [first1, last1), last1 is returned. Additionally, if the subsequence is empty or no subsequence is found, last1 is also returned.
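A sketch of a custom binary predicate for this overload: a case-insensitive search for the last occurrence of a word. The helper last_match_ignoring_case and the alias alg are hypothetical. The alias resolves to hpx when hpx/algorithm.hpp is found and otherwise to std, on the assumption that std::find_end is an equivalent stand-in for the non-policy overload.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#if __has_include(<hpx/algorithm.hpp>)
#include <hpx/algorithm.hpp>
namespace alg = hpx;    // HPX found: alg::find_end is hpx::find_end
#else
#include <algorithm>
namespace alg = std;    // stand-in so the sketch builds without HPX
#endif

// Hypothetical helper: start index of the last case-insensitive occurrence
// of `word` in `text`, or -1 when find_end returns `last1`.
inline long last_match_ignoring_case(
    std::string const& text, std::string const& word)
{
    // Binary predicate: true if both characters are the same letter.
    auto same_letter = [](char a, char b) {
        return std::tolower(static_cast<unsigned char>(a)) ==
            std::tolower(static_cast<unsigned char>(b));
    };
    auto it = alg::find_end(
        text.begin(), text.end(), word.begin(), word.end(), same_letter);
    // A reported match must fit entirely inside `text`.
    assert(it == text.end() ||
        static_cast<std::size_t>(text.end() - it) >= word.size());
    return it == text.end() ? -1 : static_cast<long>(it - text.begin());
}
```

An empty word and a word longer than the text both produce -1 here, matching the return description above.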

template<typename FwdIter1, typename FwdIter2>
FwdIter1 find_end(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2)#

Returns the last subsequence of elements [first2, last2) found in the range [first1, last1). Elements are compared using operator==.

Note

Complexity: at most S*(N-S+1) comparisons where S = distance(first2, last2) and N = distance(first1, last1).

Template Parameters
  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements the algorithm will be searching for.

  • last2 – Refers to the end of the sequence of elements the algorithm will be searching for.

Returns

The find_end algorithm returns a FwdIter1. The find_end algorithm returns an iterator to the beginning of the last subsequence [first2, last2) in range [first1, last1). If the length of the subsequence [first2, last2) is greater than the length of the range [first1, last1), last1 is returned. Additionally, if the subsequence is empty or no subsequence is found, last1 is also returned.
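The operator== overload can be sketched on integer sequences as below. The helper last_run_index and the alias alg are hypothetical. The alias resolves to hpx when hpx/algorithm.hpp is found and otherwise to std, on the assumption that std::find_end is an equivalent stand-in for the non-policy overload.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if __has_include(<hpx/algorithm.hpp>)
#include <hpx/algorithm.hpp>
namespace alg = hpx;    // HPX found: alg::find_end is hpx::find_end
#else
#include <algorithm>
namespace alg = std;    // stand-in so the sketch builds without HPX
#endif

// Hypothetical helper: start index of the last occurrence of `needle`
// inside `haystack`, or -1 when find_end returns `last1`.
inline long last_run_index(
    std::vector<int> const& haystack, std::vector<int> const& needle)
{
    auto it = alg::find_end(haystack.begin(), haystack.end(),
        needle.begin(), needle.end());
    // A reported match must fit entirely inside `haystack`.
    assert(it == haystack.end() ||
        static_cast<std::size_t>(haystack.end() - it) >= needle.size());
    return it == haystack.end() ? -1 :
        static_cast<long>(it - haystack.begin());
}
```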

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Pred = detail::equal_to>
util::detail::algorithm_result_t<ExPolicy, FwdIter1> find_first_of(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 s_first, FwdIter2 s_last, Pred &&op = Pred())#

Searches the range [first, last) for any elements in the range [s_first, s_last). Uses binary predicate op to compare elements. Executed according to the policy.

The comparison operations in the parallel find_first_of algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel find_first_of algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

This overload of find_first_of is available if the user decides to provide the algorithm their own predicate op.

Note

Complexity: at most (S*N) comparisons where S = distance(s_first, s_last) and N = distance(first, last).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the comparisons.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of find_first_of requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • s_first – Refers to the beginning of the sequence of elements the algorithm will be searching for.

  • s_last – Refers to the end of the sequence of elements the algorithm will be searching for.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The find_first_of algorithm returns a hpx::future<FwdIter1> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter1 otherwise. The find_first_of algorithm returns an iterator to the first element in the range [first, last) that is equal to an element from the range [s_first, s_last). If the length of the subsequence [s_first, s_last) is greater than the length of the range [first, last), last is returned. Additionally, if the subsequence is empty or no matching element is found, last is also returned.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
util::detail::algorithm_result_t<ExPolicy, FwdIter1> find_first_of(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 s_first, FwdIter2 s_last)#

Searches the range [first, last) for any elements in the range [s_first, s_last). Elements are compared using operator==. Executed according to the policy.

The comparison operations in the parallel find_first_of algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel find_first_of algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: at most (S*N) comparisons where S = distance(s_first, s_last) and N = distance(first, last).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • s_first – Refers to the beginning of the sequence of elements the algorithm will be searching for.

  • s_last – Refers to the end of the sequence of elements the algorithm will be searching for.

Returns

The find_first_of algorithm returns a hpx::future<FwdIter1> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter1 otherwise. The find_first_of algorithm returns an iterator to the first element in the range [first, last) that is equal to an element from the range [s_first, s_last). If the length of the subsequence [s_first, s_last) is greater than the length of the range [first, last), last is returned. Additionally, if the subsequence is empty or no matching element is found, last is also returned.

template<typename FwdIter1, typename FwdIter2, typename Pred = detail::equal_to>
FwdIter1 find_first_of(FwdIter1 first, FwdIter1 last, FwdIter2 s_first, FwdIter2 s_last, Pred &&op = Pred())#

Searches the range [first, last) for any elements in the range [s_first, s_last). Uses binary predicate op to compare elements.

This overload of find_first_of is available if the user decides to provide the algorithm their own predicate op.

Note

Complexity: at most (S*N) comparisons where S = distance(s_first, s_last) and N = distance(first, last).

Template Parameters
  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of find_first_of requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>.

Parameters
  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • s_first – Refers to the beginning of the sequence of elements the algorithm will be searching for.

  • s_last – Refers to the end of the sequence of elements the algorithm will be searching for.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The find_first_of algorithm returns a FwdIter1. The find_first_of algorithm returns an iterator to the first element in the range [first, last) that is equal to an element from the range [s_first, s_last). If the length of the subsequence [s_first, s_last) is greater than the length of the range [first, last), last is returned. Additionally, if the subsequence is empty or no matching element is found, last is also returned.
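A sketch of the predicate overload: locate the first character that matches any marker character, ignoring case. The helper first_marker_ignoring_case and the alias alg are hypothetical. The alias resolves to hpx when hpx/algorithm.hpp is found and otherwise to std, on the assumption that std::find_first_of is an equivalent stand-in for inputs where the searched range is at least as long as the marker set.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#if __has_include(<hpx/algorithm.hpp>)
#include <hpx/algorithm.hpp>
namespace alg = hpx;    // HPX found: alg::find_first_of is hpx::find_first_of
#else
#include <algorithm>
namespace alg = std;    // stand-in so the sketch builds without HPX
#endif

// Hypothetical helper: index of the first character of `text` that matches
// any character of `markers` (case-insensitively), or -1 when
// find_first_of returns `last`.
inline long first_marker_ignoring_case(
    std::string const& text, std::string const& markers)
{
    // Binary predicate: true if both characters are the same letter.
    auto same_letter = [](char a, char b) {
        return std::tolower(static_cast<unsigned char>(a)) ==
            std::tolower(static_cast<unsigned char>(b));
    };
    auto it = alg::find_first_of(text.begin(), text.end(),
        markers.begin(), markers.end(), same_letter);
    assert(it == text.end() || markers.size() > 0);
    return it == text.end() ? -1 : static_cast<long>(it - text.begin());
}
```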

template<typename FwdIter1, typename FwdIter2>
FwdIter1 find_first_of(FwdIter1 first, FwdIter1 last, FwdIter2 s_first, FwdIter2 s_last)#

Searches the range [first, last) for any elements in the range [s_first, s_last). Elements are compared using operator==.

Note

Complexity: at most (S*N) comparisons where S = distance(s_first, s_last) and N = distance(first, last).

Template Parameters
  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • s_first – Refers to the beginning of the sequence of elements the algorithm will be searching for.

  • s_last – Refers to the end of the sequence of elements the algorithm will be searching for.

Returns

The find_first_of algorithm returns a FwdIter1. The find_first_of algorithm returns an iterator to the first element in the range [first, last) that is equal to an element from the range [s_first, s_last). If the length of the subsequence [s_first, s_last) is greater than the length of the range [first, last), last is returned. Additionally, if the subsequence is empty or no matching element is found, last is also returned.
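The operator== overload can be sketched as a search for the first sentinel value in a data stream. The helper first_sentinel_index and the alias alg are hypothetical. The alias resolves to hpx when hpx/algorithm.hpp is found and otherwise to std, on the assumption that std::find_first_of is an equivalent stand-in for inputs where the searched range is at least as long as the sentinel set.

```cpp
#include <cassert>
#include <vector>
#if __has_include(<hpx/algorithm.hpp>)
#include <hpx/algorithm.hpp>
namespace alg = hpx;    // HPX found: alg::find_first_of is hpx::find_first_of
#else
#include <algorithm>
namespace alg = std;    // stand-in so the sketch builds without HPX
#endif

// Hypothetical helper: index of the first element of `data` that equals
// any element of `sentinels`, or -1 when find_first_of returns `last`.
inline long first_sentinel_index(
    std::vector<int> const& data, std::vector<int> const& sentinels)
{
    auto it = alg::find_first_of(
        data.begin(), data.end(), sentinels.begin(), sentinels.end());
    // A match is only possible when there is something to search for.
    assert(it == data.end() || !sentinels.empty());
    return it == data.end() ? -1 : static_cast<long>(it - data.begin());
}
```

Note the difference from find_end: the second range is treated as a set of candidate values, not as a subsequence that must appear contiguously.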

hpx::for_each, hpx::for_each_n#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter, typename F>
F for_each(InIter first, InIter last, F &&f)#

Applies f to the result of dereferencing every iterator in the range [first, last).

If f returns a result, the result is ignored.

If the type of first satisfies the requirements of a mutable iterator, f may apply non-constant functions through the dereferenced iterator.

Note

Complexity: Applies f exactly last - first times.

Template Parameters
  • InIter – The type of the source begin and end iterator used (deduced). This iterator type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). F must meet requirements of MoveConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    <ignored> pred(const Type &a);
    
    The signature does not need to have const&. The type Type must be such that an object of type InIter can be dereferenced and then implicitly converted to Type.

Returns

f.
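
Returning f matters when the function object carries state. The sketch below shows this with std::for_each, used here as a standard-library analogue of the sequential overload rather than HPX itself. The names sum_and_count and summarize are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Function object that accumulates state across invocations.
struct sum_and_count
{
    long sum = 0;
    int count = 0;

    void operator()(int x)
    {
        sum += x;
        ++count;
    }
};

// Hypothetical helper: the algorithm hands the function object back, so the
// state accumulated during the traversal survives the call.
sum_and_count summarize(std::vector<int> const& v)
{
    return std::for_each(v.begin(), v.end(), sum_and_count{});
}
```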

template<typename ExPolicy, typename FwdIter, typename F>
util::detail::algorithm_result_t<ExPolicy, void> for_each(ExPolicy &&policy, FwdIter first, FwdIter last, F &&f)#

Applies f to the result of dereferencing every iterator in the range [first, last). Executed according to the policy.

If f returns a result, the result is ignored.

If the type of first satisfies the requirements of a mutable iterator, f may apply non-constant functions through the dereferenced iterator.

Unlike its sequential form, the parallel overload of for_each does not return a copy of its Function parameter, since parallelization may not permit efficient state accumulation.

In parallel algorithms invoked with an execution policy object of type sequenced_policy, function objects are applied in sequential order in the calling thread.

In parallel algorithms invoked with an execution policy object of type parallel_policy or parallel_task_policy, function objects may be applied in an unordered fashion in unspecified threads, and their applications are indeterminately sequenced within each thread.

Note

Complexity: Applies f exactly last - first times.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • FwdIter – The type of the source begin and end iterator used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of for_each requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    <ignored> pred(const Type &a);
    
    The signature does not need to have const&. The type Type must be such that an object of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The for_each algorithm returns a hpx::future<void> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns void otherwise.

template<typename InIter, typename Size, typename F>
InIter for_each_n(InIter first, Size count, F &&f)#

Applies f to the result of dereferencing every iterator in the range [first, first + count), starting from first and proceeding to first + count - 1.

If f returns a result, the result is ignored.

If the type of first satisfies the requirements of a mutable iterator, f may apply non-constant functions through the dereferenced iterator.

Note

Complexity: Applies f exactly count times.

Template Parameters
  • InIter – The type of the source begin and end iterator used (deduced). This iterator type must meet the requirements of an input iterator.

  • Size – The type of the argument specifying the number of elements to apply f to.

  • F – The type of the function/function object to use (deduced). F must meet requirements of MoveConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, first + count). The signature of this predicate should be equivalent to:

    <ignored> pred(const Type &a);
    
    The signature does not need to have const&. The type Type must be such that an object of type InIter can be dereferenced and then implicitly converted to Type.

Returns

first + count for non-negative values of count and first for negative values.
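
A minimal sketch of the returned iterator, using std::for_each_n (C++17) as a standard-library analogue of the sequential overload. The helper double_prefix is hypothetical and assumes count does not exceed the size of the vector.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: doubles the first `count` elements in place and
// reports how far the returned iterator is from the beginning.
// Precondition (assumed): count <= v.size().
std::size_t double_prefix(std::vector<int>& v, std::size_t count)
{
    auto end_of_prefix =
        std::for_each_n(v.begin(), count, [](int& x) { x *= 2; });
    return static_cast<std::size_t>(end_of_prefix - v.begin());
}
```

Because the iterator is mutable, the callable may modify the elements through its reference parameter; elements beyond the first count are left untouched.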

template<typename ExPolicy, typename FwdIter, typename Size, typename F>
util::detail::algorithm_result_t<ExPolicy, FwdIter> for_each_n(ExPolicy &&policy, FwdIter first, Size count, F &&f)#

Applies f to the result of dereferencing every iterator in the range [first, first + count), starting from first and proceeding to first + count - 1. Executed according to the policy.

If f returns a result, the result is ignored.

If the type of first satisfies the requirements of a mutable iterator, f may apply non-constant functions through the dereferenced iterator.

Unlike its sequential form, the parallel overload of for_each_n does not return a copy of its Function parameter, since parallelization may not permit efficient state accumulation.

In parallel algorithms invoked with an execution policy object of type sequenced_policy, function objects are applied in sequential order in the calling thread.

In parallel algorithms invoked with an execution policy object of type parallel_policy or parallel_task_policy, function objects may be applied in an unordered fashion in unspecified threads, and their applications are indeterminately sequenced within each thread.

Note

Complexity: Applies f exactly count times.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to apply f to.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of for_each_n requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, first + count). The signature of this predicate should be equivalent to:

    <ignored> pred(const Type &a);
    
    The signature does not need to have const&. The type Type must be such that an object of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The for_each_n algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. It returns first + count for non-negative values of count and first for negative values.

hpx::experimental::for_loop, hpx::experimental::for_loop_strided, hpx::experimental::for_loop_n, hpx::experimental::for_loop_n_strided#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental#

Top-level namespace.

Functions

template<typename I, typename ...Args>
void for_loop(std::decay_t<I> first, I last, Args&&... args)#

The for_loop implements loop functionality over a range specified by integral or iterator bounds. For the iterator case, these algorithms resemble for_each from the Parallelism TS, but leave to the programmer when and if to dereference the iterator.

The execution of for_loop without specifying an execution policy is equivalent to specifying hpx::execution::seq as the execution policy.

Requires: I shall be an integral type or meet the requirements of an input iterator type. The args parameter pack shall have at least one element, comprising objects returned by invocations of reduction and/or induction function templates followed by exactly one invocable element-access function, f. f shall meet the requirements of MoveConstructible.

Effects: Applies f to each element in the input sequence, with additional arguments corresponding to the reductions and inductions in the args parameter pack. The length of the input sequence is last - first.

The first element in the input sequence is specified by first. Each subsequent element is generated by incrementing the previous element.

Along with an element from the input sequence, for each member of the args parameter pack excluding f, an additional argument is passed to each application of f as follows:

  • If the pack member is an object returned by a call to a reduction function, then the additional argument is a reference to a view of that reduction object.

  • If the pack member is an object returned by a call to induction, then the additional argument is the induction value for that induction object corresponding to the position of the application of f in the input sequence.

Complexity: Applies f exactly once for each element of the input sequence.

Remarks: If f returns a result, the result is ignored.

Note

As described in the C++ standard, arithmetic on non-random-access iterators is performed using advance and distance.

Note

The order of the elements of the input sequence is important for determining ordinal position of an application of f, even though the applications themselves may be unordered.

Template Parameters
  • I – The type of the iteration variable. This can be a (forward) iterator type or an integral type.

  • Args – A parameter pack. Its last element is a function object to be invoked for each iteration; all other elements must conform to either the induction or the reduction concept.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • args – The last element of this parameter pack is the function (object) to invoke, while the remaining elements of the parameter pack are instances of either induction or reduction objects. The function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last) should expose a signature equivalent to:

    <ignored> pred(I const& a, ...);
    
    The signature does not need to have const&. It will receive the current value of the iteration variable and one argument for each of the induction or reduction objects passed to the algorithms, representing their current values.
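
The difference from for_each can be sketched with a hand-written stand-in. The template for_loop_sketch below is hypothetical and only models the sequential behavior described above, without reductions or inductions. It passes the iteration variable itself to the callable, so the same loop works for integral bounds and for iterator bounds.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the sequential behavior: the callable receives
// the iteration variable itself (an integer or an undereferenced iterator).
template <typename I, typename F>
void for_loop_sketch(I first, I last, F f)
{
    for (; first != last; ++first)
        f(first);
}

// Integral bounds: the iteration variable is used as an index.
int sum_by_index(std::vector<int> const& v)
{
    int sum = 0;
    for_loop_sketch(
        std::size_t(0), v.size(), [&](std::size_t i) { sum += v[i]; });
    return sum;
}

// Iterator bounds: the callable decides when (and if) to dereference.
int sum_by_iterator(std::vector<int> const& v)
{
    int sum = 0;
    for_loop_sketch(v.begin(), v.end(),
        [&](std::vector<int>::const_iterator it) { sum += *it; });
    return sum;
}
```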

template<typename ExPolicy, typename I, typename ...Args>
<unspecified> for_loop(ExPolicy &&policy, std::decay_t<I> first, I last, Args&&... args)

The for_loop implements loop functionality over a range specified by integral or iterator bounds. For the iterator case, these algorithms resemble for_each from the Parallelism TS, but leave to the programmer when and if to dereference the iterator. Executed according to the policy.

Requires: I shall be an integral type or meet the requirements of an input iterator type. The args parameter pack shall have at least one element, comprising objects returned by invocations of reduction and/or induction function templates followed by exactly one invocable element-access function, f. f shall meet the requirements of MoveConstructible.

Effects: Applies f to each element in the input sequence, with additional arguments corresponding to the reductions and inductions in the args parameter pack. The length of the input sequence is last - first.

The first element in the input sequence is specified by first. Each subsequent element is generated by incrementing the previous element.

Along with an element from the input sequence, for each member of the args parameter pack excluding f, an additional argument is passed to each application of f as follows:

  • If the pack member is an object returned by a call to a reduction function, then the additional argument is a reference to a view of that reduction object.

  • If the pack member is an object returned by a call to induction, then the additional argument is the induction value for that induction object corresponding to the position of the application of f in the input sequence.

Complexity: Applies f exactly once for each element of the input sequence.

Remarks: If f returns a result, the result is ignored.

Note

As described in the C++ standard, arithmetic on non-random-access iterators is performed using advance and distance.

Note

The order of the elements of the input sequence is important for determining ordinal position of an application of f, even though the applications themselves may be unordered.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • I – The type of the iteration variable. This can be a (forward) iterator type or an integral type.

  • Args – A parameter pack. Its last element is a function object to be invoked for each iteration; all other elements must conform to either the induction or the reduction concept.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • args – The last element of this parameter pack is the function (object) to invoke, while the remaining elements of the parameter pack are instances of either induction or reduction objects. The function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last) should expose a signature equivalent to:

    <ignored> pred(I const& a, ...);
    
    The signature does not need to have const&. It will receive the current value of the iteration variable and one argument for each of the induction or reduction objects passed to the algorithms, representing their current values.

Returns

The for_loop algorithm returns a hpx::future<void> if the execution policy is of type hpx::execution::sequenced_task_policy or hpx::execution::parallel_task_policy and returns void otherwise.

template<typename I, typename S, typename ...Args>
void for_loop_strided(std::decay_t<I> first, I last, S stride, Args&&... args)#

The for_loop_strided implements loop functionality over a range specified by integral or iterator bounds. For the iterator case, these algorithms resemble for_each from the Parallelism TS, but leave to the programmer when and if to dereference the iterator.

The execution of for_loop_strided without specifying an execution policy is equivalent to specifying hpx::execution::seq as the execution policy.

Requires: I shall be an integral type or meet the requirements of an input iterator type. The args parameter pack shall have at least one element, comprising objects returned by invocations of reduction and/or induction function templates followed by exactly one invocable element-access function, f. f shall meet the requirements of MoveConstructible.

Effects: Applies f to each element in the input sequence, with additional arguments corresponding to the reductions and inductions in the args parameter pack. For a non-empty range, the length of the input sequence is 1 + (last - first - 1) / stride if stride is positive and 1 + (first - last - 1) / -stride otherwise.

The first element in the input sequence is specified by first. Each subsequent element is generated by adding stride to the previous element.

Along with an element from the input sequence, for each member of the args parameter pack excluding f, an additional argument is passed to each application of f as follows:

  • If the pack member is an object returned by a call to a reduction function, then the additional argument is a reference to a view of that reduction object.

  • If the pack member is an object returned by a call to induction, then the additional argument is the induction value for that induction object corresponding to the position of the application of f in the input sequence.

Complexity: Applies f exactly once for each element of the input sequence.

Remarks: If f returns a result, the result is ignored.

Note

As described in the C++ standard, arithmetic on non-random-access iterators is performed using advance and distance.

Note

The order of the elements of the input sequence is important for determining ordinal position of an application of f, even though the applications themselves may be unordered.

Template Parameters
  • I – The type of the iteration variable. This can be a (forward) iterator type or an integral type.

  • S – The type of the stride variable. This should be an integral type.

  • Args – A parameter pack. Its last element is a function object to be invoked for each iteration; all other elements must conform to either the induction or the reduction concept.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • stride – Refers to the stride of the iteration steps. This shall have non-zero value and shall be negative only if I has integral type or meets the requirements of a bidirectional iterator.

  • args – The last element of this parameter pack is the function (object) to invoke, while the remaining elements of the parameter pack are instances of either induction or reduction objects. The function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last) should expose a signature equivalent to:

    <ignored> pred(I const& a, ...);
    
    The signature does not need to have const&. It will receive the current value of the iteration variable and one argument for each of the induction or reduction objects passed to the algorithms, representing their current values.
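
To make the effect of stride concrete, the sketch below models which values the callable would see for integral bounds. It assumes the rule from the Parallelism TS that each element after first is obtained by adding stride to the previous one. The function strided_values is a hypothetical model, not part of HPX.

```cpp
#include <vector>

// Hypothetical model for integral bounds: collects the values of the
// iteration variable in order. `stride` must be non-zero.
std::vector<int> strided_values(int first, int last, int stride)
{
    std::vector<int> visited;
    if (stride > 0)
    {
        // Positive stride: walk upwards while still before `last`.
        for (int i = first; i < last; i += stride)
            visited.push_back(i);
    }
    else
    {
        // Negative stride: walk downwards while still after `last`.
        for (int i = first; i > last; i += stride)
            visited.push_back(i);
    }
    return visited;
}
```

For bounds 0 and 10 with a stride of 3 the model visits 0, 3, 6 and 9. With bounds 10 and 0 and a stride of -4 it visits 10, 6 and 2.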

template<typename ExPolicy, typename I, typename S, typename ...Args>
<unspecified> for_loop_strided(ExPolicy &&policy, std::decay_t<I> first, I last, S stride, Args&&... args)

The for_loop_strided implements loop functionality over a range specified by integral or iterator bounds. For the iterator case, these algorithms resemble for_each from the Parallelism TS, but leave to the programmer when and if to dereference the iterator. Executed according to the policy.

Requires: I shall be an integral type or meet the requirements of an input iterator type. The args parameter pack shall have at least one element, comprising objects returned by invocations of reduction and/or induction function templates followed by exactly one invocable element-access function, f. f shall meet the requirements of MoveConstructible.

Effects: Applies f to each element in the input sequence, with additional arguments corresponding to the reductions and inductions in the args parameter pack. For a non-empty range, the length of the input sequence is 1 + (last - first - 1) / stride if stride is positive and 1 + (first - last - 1) / -stride otherwise.

The first element in the input sequence is specified by first. Each subsequent element is generated by adding stride to the previous element.

Along with an element from the input sequence, for each member of the args parameter pack excluding f, an additional argument is passed to each application of f as follows:

  • If the pack member is an object returned by a call to a reduction function, then the additional argument is a reference to a view of that reduction object.

  • If the pack member is an object returned by a call to induction, then the additional argument is the induction value for that induction object corresponding to the position of the application of f in the input sequence.

Complexity: Applies f exactly once for each element of the input sequence.

Remarks: If f returns a result, the result is ignored.

Note

As described in the C++ standard, arithmetic on non-random-access iterators is performed using advance and distance.

Note

The order of the elements of the input sequence is important for determining ordinal position of an application of f, even though the applications themselves may be unordered.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • I – The type of the iteration variable. This can be a (forward) iterator type or an integral type.

  • S – The type of the stride variable. This should be an integral type.

  • Args – A parameter pack. Its last element is a function object to be invoked for each iteration; all other elements must conform to either the induction or the reduction concept.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • stride – Refers to the stride of the iteration steps. This shall have non-zero value and shall be negative only if I has integral type or meets the requirements of a bidirectional iterator.

  • args – The last element of this parameter pack is the function (object) to invoke, while the remaining elements of the parameter pack are instances of either induction or reduction objects. The function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last) should expose a signature equivalent to:

    <ignored> pred(I const& a, ...);
    
    The signature does not need to have const&. It will receive the current value of the iteration variable and one argument for each of the induction or reduction objects passed to the algorithms, representing their current values.

Returns

The for_loop_strided algorithm returns a hpx::future<void> if the execution policy is of type hpx::execution::sequenced_task_policy or hpx::execution::parallel_task_policy and returns void otherwise.

template<typename I, typename Size, typename ...Args>
void for_loop_n(I first, Size size, Args&&... args)#

The for_loop_n implements loop functionality over a range specified by integral or iterator bounds. For the iterator case, these algorithms resemble for_each from the Parallelism TS, but leave to the programmer when and if to dereference the iterator.

The execution of for_loop_n without specifying an execution policy is equivalent to specifying hpx::execution::seq as the execution policy.

Requires: I shall be an integral type or meet the requirements of an input iterator type. The args parameter pack shall have at least one element, comprising objects returned by invocations of reduction and/or induction function templates followed by exactly one invocable element-access function, f. f shall meet the requirements of MoveConstructible.

Effects: Applies f to each element in the input sequence, with additional arguments corresponding to the reductions and inductions in the args parameter pack. The length of the input sequence is size.

The first element in the input sequence is specified by first. Each subsequent element is generated by incrementing the previous element.

Along with an element from the input sequence, for each member of the args parameter pack excluding f, an additional argument is passed to each application of f as follows:

  • If the pack member is an object returned by a call to a reduction function, then the additional argument is a reference to a view of that reduction object.

  • If the pack member is an object returned by a call to induction, then the additional argument is the induction value for that induction object corresponding to the position of the application of f in the input sequence.

Complexity: Applies f exactly once for each element of the input sequence.

Remarks: If f returns a result, the result is ignored.

Note

As described in the C++ standard, arithmetic on non-random-access iterators is performed using advance and distance.

Note

The order of the elements of the input sequence is important for determining ordinal position of an application of f, even though the applications themselves may be unordered.

Template Parameters
  • I – The type of the iteration variable. This can be a (forward) iterator type or an integral type.

  • Size – The type of a non-negative integral value specifying the number of items to iterate over.

  • Args – A parameter pack. Its last element is a function object to be invoked for each iteration; all other elements must conform to either the induction or the reduction concept.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • size – Refers to the number of items the algorithm will be applied to.

  • args – The last element of this parameter pack is the function (object) to invoke, while the remaining elements of the parameter pack are instances of either induction or reduction objects. The function (or function object) which will be invoked for each of the elements in the sequence specified by [first, first + size) should expose a signature equivalent to:

    <ignored> pred(I const& a, ...);
    
    The signature does not need to have const&. It will receive the current value of the iteration variable and one argument for each of the induction or reduction objects passed to the algorithms, representing their current values.
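
The pairing of elements with induction values can be sketched by hand. The model below is hypothetical and covers only an integral first with a single induction object, assumed to have been created from an initial value init and a stride step. At position k of the input sequence, the callable receives the element first + k together with the induction value init + k * step.

```cpp
#include <utility>
#include <vector>

// Hypothetical model of the sequential behavior for an integral `first`:
// position k pairs the element first + k with the value init + k * step.
template <typename F>
void for_loop_n_induction_sketch(int first, int size, int init, int step, F f)
{
    for (int k = 0; k < size; ++k)
        f(first + k, init + k * step);
}

// Records the (element, induction value) pairs the callable receives.
std::vector<std::pair<int, int>> observed_pairs(
    int first, int size, int init, int step)
{
    std::vector<std::pair<int, int>> seen;
    for_loop_n_induction_sketch(first, size, init, step,
        [&](int element, int induction_value) {
            seen.emplace_back(element, induction_value);
        });
    return seen;
}
```

Since the induction value depends only on the position, the same pairs would be produced regardless of the order in which the applications run.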

template<typename ExPolicy, typename I, typename Size, typename ...Args>
<unspecified> for_loop_n(ExPolicy &&policy, I first, Size size, Args&&... args)

The for_loop_n implements loop functionality over a range specified by integral or iterator bounds. For the iterator case, these algorithms resemble for_each from the Parallelism TS, but leave to the programmer when and if to dereference the iterator. Executed according to the policy.

Requires: I shall be an integral type or meet the requirements of an input iterator type. The args parameter pack shall have at least one element, comprising objects returned by invocations of reduction and/or induction function templates followed by exactly one invocable element-access function, f. f shall meet the requirements of MoveConstructible.

Effects: Applies f to each element in the input sequence, with additional arguments corresponding to the reductions and inductions in the args parameter pack. The length of the input sequence is size.

The first element in the input sequence is specified by first. Each subsequent element is generated by incrementing the previous element.

Along with an element from the input sequence, for each member of the args parameter pack excluding f, an additional argument is passed to each application of f as follows:

  • If the pack member is an object returned by a call to a reduction function, then the additional argument is a reference to a view of that reduction object.

  • If the pack member is an object returned by a call to induction, then the additional argument is the induction value for that induction object corresponding to the position of the application of f in the input sequence.

Complexity: Applies f exactly once for each element of the input sequence.

Remarks: If f returns a result, the result is ignored.

Note

As described in the C++ standard, arithmetic on non-random-access iterators is performed using advance and distance.

Note

The order of the elements of the input sequence is important for determining ordinal position of an application of f, even though the applications themselves may be unordered.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • I – The type of the iteration variable. This can be a (forward) iterator type or an integral type.

  • Size – The type of a non-negative integral value specifying the number of items to iterate over.

  • Args – A parameter pack. Its last element is a function object to be invoked for each iteration; all other elements must conform to either the induction or the reduction concept.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • size – Refers to the number of items the algorithm will be applied to.

  • args – The last element of this parameter pack is the function (object) to invoke, while the remaining elements of the parameter pack are instances of either induction or reduction objects. The function (or function object) which will be invoked for each of the elements in the sequence specified by [first, first + size) should expose a signature equivalent to:

    <ignored> pred(I const& a, ...);
    
    The signature does not need to have const&. It will receive the current value of the iteration variable and one argument for each of the induction or reduction objects passed to the algorithms, representing their current values.

Returns

The for_loop_n algorithm returns a hpx::future<void> if the execution policy is of type hpx::execution::sequenced_task_policy or hpx::execution::parallel_task_policy and returns void otherwise.

template<typename I, typename Size, typename S, typename ...Args>
void for_loop_n_strided(I first, Size size, S stride, Args&&... args)#

The for_loop_n_strided implements loop functionality over a range specified by integral or iterator bounds. For the iterator case, these algorithms resemble for_each from the Parallelism TS, but leave to the programmer when and if to dereference the iterator.

The execution of for_loop_n_strided without specifying an execution policy is equivalent to specifying hpx::execution::seq as the execution policy.

Requires: I shall be an integral type or meet the requirements of an input iterator type. The args parameter pack shall have at least one element, comprising objects returned by invocations of reduction and/or induction function templates followed by exactly one invocable element-access function, f. f shall meet the requirements of MoveConstructible.

Effects: Applies f to each element in the input sequence, with additional arguments corresponding to the reductions and inductions in the args parameter pack. The length of the input sequence is size.

The first element in the input sequence is specified by first. Each subsequent element is generated by adding stride to the previous element.

Along with an element from the input sequence, for each member of the args parameter pack excluding f, an additional argument is passed to each application of f as follows:

If the pack member is an object returned by a call to a reduction function (reduction or one of the reduction_* function templates), then the additional argument is a reference to a view of that reduction object. If the pack member is an object returned by a call to induction, then the additional argument is the induction value for that induction object corresponding to the position of the application of f in the input sequence.

Complexity: Applies f exactly once for each element of the input sequence.

Remarks: If f returns a result, the result is ignored.

Note

As described in the C++ standard, arithmetic on non-random-access iterators is performed using advance and distance.

Note

The order of the elements of the input sequence is important for determining ordinal position of an application of f, even though the applications themselves may be unordered.

Template Parameters
  • I – The type of the iteration variable. This could be a (forward) iterator type or an integral type.

  • Size – The type of a non-negative integral value specifying the number of items to iterate over.

  • S – The type of the stride variable. This should be an integral type.

  • Args – A parameter pack. Its last element is a function object to be invoked for each iteration; all other elements have to conform to either the induction or the reduction concept.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • size – Refers to the number of items the algorithm will be applied to.

  • stride – Refers to the stride of the iteration steps. This shall have non-zero value and shall be negative only if I has integral type or meets the requirements of a bidirectional iterator.

  • args – The last element of this parameter pack is the function (object) to invoke, while the remaining elements of the parameter pack are instances of either induction or reduction objects. The function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last) should expose a signature equivalent to:

    <ignored> pred(I const& a, ...);
    
    The signature does not need to have const&. It will receive the current value of the iteration variable and one argument for each of the induction or reduction objects passed to the algorithms, representing their current values.
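The strided traversal described above can be pictured with a small sequential model. This is only a sketch in standard C++, not the HPX implementation: the names strided_loop_model and visited_values are hypothetical, it covers only an integral iteration variable, and it assumes that each subsequent element is obtained by adding stride to the previous one, as in the Parallelism TS.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sequential model of a strided loop over `size` elements.
// f is applied exactly once per element; any value it returns is ignored.
template <typename I, typename S, typename F>
void strided_loop_model(I first, std::size_t size, S stride, F f)
{
    I current = first;
    for (std::size_t p = 0; p != size; ++p)
    {
        f(current);
        current = static_cast<I>(current + stride);
    }
}

// Collects the values of the iteration variable seen by f.
inline std::vector<int> visited_values(
    int first, std::size_t size, int stride)
{
    std::vector<int> out;
    strided_loop_model(
        first, size, stride, [&out](int i) { out.push_back(i); });
    return out;
}
```

Under these assumptions, visited_values(0, 4, 3) yields 0, 3, 6, 9, and a negative stride walks downwards from first.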

template<typename ExPolicy, typename I, typename Size, typename S, typename ...Args>
<unspecified> for_loop_n_strided(ExPolicy &&policy, I first, Size size, S stride, Args&&... args)

The for_loop_n_strided implements loop functionality over a range specified by integral or iterator bounds. For the iterator case, these algorithms resemble for_each from the Parallelism TS, but leave to the programmer when and if to dereference the iterator. Executed according to the policy.

Requires: I shall be an integral type or meet the requirements of an input iterator type. The args parameter pack shall have at least one element, comprising objects returned by invocations of reduction and/or induction function templates followed by exactly one invocable element-access function, f. f shall meet the requirements of MoveConstructible.

Effects: Applies f to each element in the input sequence, with additional arguments corresponding to the reductions and inductions in the args parameter pack. The length of the input sequence is size.

The first element in the input sequence is specified by first. Each subsequent element is generated by adding stride to the previous element.

Along with an element from the input sequence, for each member of the args parameter pack excluding f, an additional argument is passed to each application of f as follows:

If the pack member is an object returned by a call to a reduction function (reduction or one of the reduction_* function templates), then the additional argument is a reference to a view of that reduction object. If the pack member is an object returned by a call to induction, then the additional argument is the induction value for that induction object corresponding to the position of the application of f in the input sequence.

Complexity: Applies f exactly once for each element of the input sequence.

Remarks: If f returns a result, the result is ignored.

Note

As described in the C++ standard, arithmetic on non-random-access iterators is performed using advance and distance.

Note

The order of the elements of the input sequence is important for determining ordinal position of an application of f, even though the applications themselves may be unordered.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • I – The type of the iteration variable. This could be a (forward) iterator type or an integral type.

  • Size – The type of a non-negative integral value specifying the number of items to iterate over.

  • S – The type of the stride variable. This should be an integral type.

  • Args – A parameter pack. Its last element is a function object to be invoked for each iteration; all other elements have to conform to either the induction or the reduction concept.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • size – Refers to the number of items the algorithm will be applied to.

  • stride – Refers to the stride of the iteration steps. This shall have non-zero value and shall be negative only if I has integral type or meets the requirements of a bidirectional iterator.

  • args – The last element of this parameter pack is the function (object) to invoke, while the remaining elements of the parameter pack are instances of either induction or reduction objects. The function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last) should expose a signature equivalent to:

    <ignored> pred(I const& a, ...);
    
    The signature does not need to have const&. It will receive the current value of the iteration variable and one argument for each of the induction or reduction objects passed to the algorithms, representing their current values.

Returns

The for_loop_n_strided algorithm returns a hpx::future<void> if the execution policy is of type hpx::execution::sequenced_task_policy or hpx::execution::parallel_task_policy and returns void otherwise.

hpx::experimental::induction#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental

Top-level namespace.

Functions

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::induction_stride_helper<T> induction(T &&value, std::size_t stride)

The function template returns an induction object of unspecified type having a value type and encapsulating an initial value of that type and, optionally, a stride.

For each element in the input range, a looping algorithm over input sequence S computes an induction value from the initial value i of the induction variable and the ordinal position p within S by the formula i + p * stride if a stride was specified, or i + p otherwise. This induction value is passed to the element-access function.

If the value argument to induction is a non-const lvalue, then that lvalue becomes the live-out object for the returned induction object. For each induction object that has a live-out object, the looping algorithm assigns the value of i + n * stride to the live-out object upon return, where n is the number of elements in the input range.
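The two formulas above can be checked with a small model in standard C++. This is a sketch, not the HPX implementation: induction_value and run_with_induction are hypothetical names, and the value type is fixed to long for brevity.

```cpp
#include <cstddef>
#include <vector>

// Induction value at ordinal position p: i + p * stride.
inline long induction_value(long i, std::size_t p, long stride = 1)
{
    return i + static_cast<long>(p) * stride;
}

// Hypothetical model of a loop over n elements that uses `live_out` as a
// non-const lvalue induction variable. Returns the induction values that
// would be passed to the element-access function and, on return, assigns
// i + n * stride to the live-out object.
inline std::vector<long> run_with_induction(
    long& live_out, std::size_t n, long stride)
{
    long const i = live_out;    // initial value of the induction variable
    std::vector<long> seen;
    for (std::size_t p = 0; p != n; ++p)
    {
        seen.push_back(induction_value(i, p, stride));
    }
    live_out = induction_value(i, n, stride);
    return seen;
}
```

Starting from a variable holding 10 with n = 4 and stride = 2, the model passes 10, 12, 14, 16 to the element-access function and leaves 18 in the variable.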

Template Parameters

T – The value type to be used by the induction object.

Parameters
  • value – [in] The initial value to use for the induction object

  • stride – [in] The (optional) stride to use for the induction object (default: 1)

Returns

This returns an induction object with value type T, initial value, and (if specified) stride. If T is an lvalue of non-const type, value is used as the live-out object for the induction object; otherwise there is no live-out object.

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::induction_helper<T> induction(T &&value)

hpx::experimental::reduction#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental

Top-level namespace.

Functions

template<typename T, typename Op>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, std::decay_t<Op>> reduction(T &var, T const &identity, Op &&combiner)

The function template returns a reduction object of unspecified type having a value type and encapsulating an identity value for the reduction, a combiner function object, and a live-out object from which the initial value is obtained and into which the final value is stored.

A parallel algorithm uses reduction objects by allocating an unspecified number of instances, called views, of the reduction’s value type. Each view is initialized with the reduction object’s identity value, except that the live-out object (which was initialized by the caller) comprises one of the views. The algorithm passes a reference to a view to each application of an element-access function, ensuring that no two concurrently-executing invocations share the same view. A view can be shared between two applications that do not execute concurrently, but initialization is performed only once per view.

Modifications to the view by the application of element access functions accumulate as partial results. At some point before the algorithm returns, the partial results are combined, two at a time, using the reduction object’s combiner operation until a single value remains, which is then assigned back to the live-out object.
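The view mechanism can be modelled sequentially in standard C++. The sketch below is an illustration only, not how HPX allocates or schedules views: reduce_with_views and sum_with_views are hypothetical names, and elements are assigned to views round-robin purely to keep the model deterministic.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical model of a reduction with `num_views` views.
template <typename T, typename Op>
void reduce_with_views(T& var, T const& identity, Op combiner,
    std::vector<T> const& input, std::size_t num_views)
{
    if (num_views == 0)
        num_views = 1;

    // Every view starts from the identity, except the one that stands in
    // for the live-out object, which keeps the caller's initial value.
    std::vector<T> views(num_views, identity);
    views[0] = var;

    // Each application of the element-access function updates one view.
    for (std::size_t p = 0; p != input.size(); ++p)
    {
        T& view = views[p % num_views];
        view = combiner(view, input[p]);
    }

    // Partial results are combined two at a time until one value remains,
    // which is assigned back to the live-out object.
    T result = views[0];
    for (std::size_t v = 1; v != views.size(); ++v)
    {
        result = combiner(result, views[v]);
    }
    var = result;
}

// Sums `input` into a variable that starts at `initial`.
inline int sum_with_views(
    int initial, std::vector<int> const& input, std::size_t num_views)
{
    int var = initial;
    reduce_with_views(var, 0, std::plus<int>{}, input, num_views);
    return var;
}
```

Because addition is associative and commutative and 0 is its identity, the model returns the same sum for any number of views.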

T shall meet the requirements of CopyConstructible and MoveAssignable. The expression

var = combiner(var, var) 
shall be well-formed.

Note

In order to produce useful results, modifications to the view should be limited to commutative operations closely related to the combiner operation. For example if the combiner is plus<T>, incrementing the view would be consistent with the combiner but doubling it or assigning to it would not.
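To see why the note matters, consider a sequential model of a plus reduction in which the element-access function either increments or doubles its view. This is a sketch in standard C++ with hypothetical names (run_plus_reduction, increment_result, doubling_result); the round-robin assignment of applications to views is an assumption made only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model: apply `update` n times, spread over num_views views,
// then combine the views with operator+.
template <typename Update>
int run_plus_reduction(
    int initial, std::size_t n, std::size_t num_views, Update update)
{
    if (num_views == 0)
        num_views = 1;

    std::vector<int> views(num_views, 0);    // 0 is the identity of plus
    views[0] = initial;                      // live-out object is one view

    for (std::size_t p = 0; p != n; ++p)
    {
        update(views[p % num_views]);
    }

    int result = 0;
    for (int v : views)
    {
        result += v;
    }
    return result;
}

// Incrementing is consistent with a plus combiner.
inline int increment_result(std::size_t n, std::size_t num_views)
{
    return run_plus_reduction(1, n, num_views, [](int& view) { ++view; });
}

// Doubling is not: the outcome depends on how many views exist.
inline int doubling_result(std::size_t n, std::size_t num_views)
{
    return run_plus_reduction(1, n, num_views, [](int& view) { view *= 2; });
}
```

In this model, incrementing gives 5 for four applications regardless of the number of views, while doubling gives 16 with one view and 4 with two.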

Template Parameters
  • T – The value type to be used by the reduction object.

  • Op – The type of the binary function (object) used to perform the reduction operation.

Parameters
  • var – [in,out] The live-out value to use for the reduction object. This will hold the reduced value after the algorithm is finished executing.

  • identity – [in] The identity value to use for the reduction operation. This argument is optional and defaults to a copy of var.

  • combiner – [in] The binary function (object) used to perform a pairwise reduction on the elements.

Returns

This returns a reduction object of unspecified type having a value type of T. When the return value is used by an algorithm, the reference to var is used as the live-out object, new views are initialized to a copy of identity, and views are combined by invoking the copy of combiner, passing it the two views to be combined.

template<typename T, typename Op>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, std::decay_t<Op>> reduction(T &var, Op &&combiner)

hpx::experimental::reduction_bit_and#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental

Top-level namespace.

Functions

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, std::bit_and<T>> reduction_bit_and(T &var)

The function template reduction_bit_and returns a reduction object of unspecified type having a value type and encapsulating an identity value for the reduction, std::bit_and{} as its combiner function, and a live-out object from which the initial value is obtained and into which the final value is stored.

A parallel algorithm uses the reduction_bit_and object by allocating an unspecified number of instances, called views, of the reduction’s value type. Each view is initialized with the reduction object’s identity value, except that the live-out object (which was initialized by the caller) comprises one of the views. The algorithm passes a reference to a view to each application of an element-access function, ensuring that no two concurrently-executing invocations share the same view. A view can be shared between two applications that do not execute concurrently, but initialization is performed only once per view.

Modifications to the view by the application of element access functions accumulate as partial results. At some point before the algorithm returns, the partial results are combined, two at a time, using the std::bit_and{} operation until a single value remains, which is then assigned back to the live-out object.

T shall meet the requirements of CopyConstructible and MoveAssignable.

Note

In order to produce useful results, modifications to the view should be limited to commutative operations closely related to the combiner operation.
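For a bitwise AND, the mathematical identity is a value with all bits set, since x & ~0 == x. The sketch below models a bit_and reduction over 8-bit masks in standard C++; and_all is a hypothetical name, the identity 0xFF is chosen by the example rather than taken from HPX, and the round-robin use of views is an assumption for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical model of a bit_and reduction with `num_views` views.
inline std::uint8_t and_all(std::uint8_t var,
    std::vector<std::uint8_t> const& masks, std::size_t num_views)
{
    if (num_views == 0)
        num_views = 1;

    std::uint8_t const identity = 0xFF;    // all bits set
    std::vector<std::uint8_t> views(num_views, identity);
    views[0] = var;                        // live-out object is one view

    std::bit_and<std::uint8_t> combiner;
    for (std::size_t p = 0; p != masks.size(); ++p)
    {
        std::uint8_t& view = views[p % num_views];
        view = combiner(view, masks[p]);
    }

    std::uint8_t result = views[0];
    for (std::size_t v = 1; v != views.size(); ++v)
    {
        result = combiner(result, views[v]);
    }
    return result;
}
```

Views that never receive an element keep the identity and therefore do not affect the combined result.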

Template Parameters
  • T – The value type to be used by the reduction object.

Parameters
  • var – [in,out] The live-out value to use for the reduction object. This will hold the reduced value after the algorithm is finished executing.

  • identity – [in] The identity value to use for the reduction operation. This argument is optional and defaults to a copy of var.

Returns

This returns a reduction object of unspecified type having a value type of T. When the return value is used by an algorithm, the reference to var is used as the live-out object, new views are initialized to a copy of identity, and views are combined by invoking std::bit_and{}, passing it the two views to be combined.

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, std::bit_and<T>> reduction_bit_and(T &var, T const &identity)

hpx::experimental::reduction_bit_or#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental

Top-level namespace.

Functions

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, std::bit_or<T>> reduction_bit_or(T &var)

The function template reduction_bit_or returns a reduction object of unspecified type having a value type and encapsulating an identity value for the reduction, std::bit_or{} as its combiner function, and a live-out object from which the initial value is obtained and into which the final value is stored.

A parallel algorithm uses the reduction_bit_or object by allocating an unspecified number of instances, called views, of the reduction’s value type. Each view is initialized with the reduction object’s identity value, except that the live-out object (which was initialized by the caller) comprises one of the views. The algorithm passes a reference to a view to each application of an element-access function, ensuring that no two concurrently-executing invocations share the same view. A view can be shared between two applications that do not execute concurrently, but initialization is performed only once per view.

Modifications to the view by the application of element access functions accumulate as partial results. At some point before the algorithm returns, the partial results are combined, two at a time, using the std::bit_or{} operation until a single value remains, which is then assigned back to the live-out object.

T shall meet the requirements of CopyConstructible and MoveAssignable.

Note

In order to produce useful results, modifications to the view should be limited to commutative operations closely related to the combiner operation.

Template Parameters
  • T – The value type to be used by the reduction object.

Parameters
  • var – [in,out] The live-out value to use for the reduction object. This will hold the reduced value after the algorithm is finished executing.

  • identity – [in] The identity value to use for the reduction operation. This argument is optional and defaults to a copy of var.

Returns

This returns a reduction object of unspecified type having a value type of T. When the return value is used by an algorithm, the reference to var is used as the live-out object, new views are initialized to a copy of identity, and views are combined by invoking std::bit_or{}, passing it the two views to be combined.

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, std::bit_or<T>> reduction_bit_or(T &var, T const &identity)

hpx::experimental::reduction_bit_xor#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental

Top-level namespace.

Functions

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, std::bit_xor<T>> reduction_bit_xor(T &var)

The function template reduction_bit_xor returns a reduction object of unspecified type having a value type and encapsulating an identity value for the reduction, std::bit_xor{} as its combiner function, and a live-out object from which the initial value is obtained and into which the final value is stored.

A parallel algorithm uses the reduction_bit_xor object by allocating an unspecified number of instances, called views, of the reduction’s value type. Each view is initialized with the reduction object’s identity value, except that the live-out object (which was initialized by the caller) comprises one of the views. The algorithm passes a reference to a view to each application of an element-access function, ensuring that no two concurrently-executing invocations share the same view. A view can be shared between two applications that do not execute concurrently, but initialization is performed only once per view.

Modifications to the view by the application of element access functions accumulate as partial results. At some point before the algorithm returns, the partial results are combined, two at a time, using the std::bit_xor{} operation until a single value remains, which is then assigned back to the live-out object.

T shall meet the requirements of CopyConstructible and MoveAssignable.

Note

In order to produce useful results, modifications to the view should be limited to commutative operations closely related to the combiner operation.

Template Parameters
  • T – The value type to be used by the reduction object.

Parameters
  • var – [in,out] The live-out value to use for the reduction object. This will hold the reduced value after the algorithm is finished executing.

  • identity – [in] The identity value to use for the reduction operation. This argument is optional and defaults to a copy of var.

Returns

This returns a reduction object of unspecified type having a value type of T. When the return value is used by an algorithm, the reference to var is used as the live-out object, new views are initialized to a copy of identity, and views are combined by invoking std::bit_xor{}, passing it the two views to be combined.

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, std::bit_xor<T>> reduction_bit_xor(T &var, T const &identity)

hpx::experimental::reduction_max#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental

Top-level namespace.

Functions

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, hpx::parallel::detail::max_of<T>> reduction_max(T &var)

The function template reduction_max returns a reduction object of unspecified type having a value type and encapsulating an identity value for the reduction, hpx::parallel::detail::max_of<T>{} as its combiner function, and a live-out object from which the initial value is obtained and into which the final value is stored.

A parallel algorithm uses the reduction_max object by allocating an unspecified number of instances, called views, of the reduction’s value type. Each view is initialized with the reduction object’s identity value, except that the live-out object (which was initialized by the caller) comprises one of the views. The algorithm passes a reference to a view to each application of an element-access function, ensuring that no two concurrently-executing invocations share the same view. A view can be shared between two applications that do not execute concurrently, but initialization is performed only once per view.

Modifications to the view by the application of element access functions accumulate as partial results. At some point before the algorithm returns, the partial results are combined, two at a time, using the hpx::parallel::detail::max_of<T>{} operation until a single value remains, which is then assigned back to the live-out object.

T shall meet the requirements of CopyConstructible and MoveAssignable.

Note

In order to produce useful results, modifications to the view should be limited to commutative operations closely related to the combiner operation.
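For a maximum reduction, the consistent modification of a view is "replace the view by the larger of the view and the new value". The model below sketches this in standard C++, using std::max as a stand-in for the combiner and std::numeric_limits<int>::lowest() as an identity chosen by the example; max_with_views is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical model of a max reduction with `num_views` views.
inline int max_with_views(
    int var, std::vector<int> const& input, std::size_t num_views)
{
    if (num_views == 0)
        num_views = 1;

    // lowest() never wins a comparison, so untouched views are neutral.
    std::vector<int> views(num_views, std::numeric_limits<int>::lowest());
    views[0] = var;    // the live-out object contributes its initial value

    for (std::size_t p = 0; p != input.size(); ++p)
    {
        int& view = views[p % num_views];
        view = std::max(view, input[p]);
    }

    // Combining all views pairwise with max equals taking their maximum.
    return *std::max_element(views.begin(), views.end());
}
```

Note that the initial value of the live-out object takes part in the reduction: if it is larger than every element, it is the result.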

Template Parameters
  • T – The value type to be used by the reduction object.

Parameters
  • var – [in,out] The live-out value to use for the reduction object. This will hold the reduced value after the algorithm is finished executing.

  • identity – [in] The identity value to use for the reduction operation. This argument is optional and defaults to a copy of var.

Returns

This returns a reduction object of unspecified type having a value type of T. When the return value is used by an algorithm, the reference to var is used as the live-out object, new views are initialized to a copy of identity, and views are combined by invoking hpx::parallel::detail::max_of<T>{}, passing it the two views to be combined.

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, hpx::parallel::detail::max_of<T>> reduction_max(T &var, T const &identity)

hpx::experimental::reduction_min#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental

Top-level namespace.

Functions

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, hpx::parallel::detail::min_of<T>> reduction_min(T &var)

The function template reduction_min returns a reduction object of unspecified type having a value type and encapsulating an identity value for the reduction, hpx::parallel::detail::min_of<T>{} as its combiner function, and a live-out object from which the initial value is obtained and into which the final value is stored.

A parallel algorithm uses the reduction_min object by allocating an unspecified number of instances, called views, of the reduction’s value type. Each view is initialized with the reduction object’s identity value, except that the live-out object (which was initialized by the caller) comprises one of the views. The algorithm passes a reference to a view to each application of an element-access function, ensuring that no two concurrently-executing invocations share the same view. A view can be shared between two applications that do not execute concurrently, but initialization is performed only once per view.

Modifications to the view by the application of element access functions accumulate as partial results. At some point before the algorithm returns, the partial results are combined, two at a time, using the hpx::parallel::detail::min_of<T>{} operation until a single value remains, which is then assigned back to the live-out object.

T shall meet the requirements of CopyConstructible and MoveAssignable.

Note

In order to produce useful results, modifications to the view should be limited to commutative operations closely related to the combiner operation.

Template Parameters
  • T – The value type to be used by the reduction object.

Parameters
  • var – [in,out] The live-out value to use for the reduction object. This will hold the reduced value after the algorithm is finished executing.

  • identity – [in] The identity value to use for the reduction operation. This argument is optional and defaults to a copy of var.

Returns

This returns a reduction object of unspecified type having a value type of T. When the return value is used by an algorithm, the reference to var is used as the live-out object, new views are initialized to a copy of identity, and views are combined by invoking hpx::parallel::detail::min_of<T>{}, passing it the two views to be combined.

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, hpx::parallel::detail::min_of<T>> reduction_min(T &var, T const &identity)

hpx::experimental::reduction_multiplies#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental

Top-level namespace.

Functions

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, std::multiplies<T>> reduction_multiplies(T &var)

The function template reduction_multiplies returns a reduction object of unspecified type having a value type and encapsulating an identity value for the reduction, std::multiplies{} as its combiner function, and a live-out object from which the initial value is obtained and into which the final value is stored.

A parallel algorithm uses the reduction_multiplies object by allocating an unspecified number of instances, called views, of the reduction’s value type. Each view is initialized with the reduction object’s identity value, except that the live-out object (which was initialized by the caller) comprises one of the views. The algorithm passes a reference to a view to each application of an element-access function, ensuring that no two concurrently-executing invocations share the same view. A view can be shared between two applications that do not execute concurrently, but initialization is performed only once per view.

Modifications to the view by the application of element access functions accumulate as partial results. At some point before the algorithm returns, the partial results are combined, two at a time, using the std::multiplies{} operation until a single value remains, which is then assigned back to the live-out object.

T shall meet the requirements of CopyConstructible and MoveAssignable.

Note

In order to produce useful results, modifications to the view should be limited to commutative operations closely related to the combiner operation.

Template Parameters
  • T – The value type to be used by the reduction object.

Parameters
  • var – [in,out] The live-out value to use for the reduction object. This will hold the reduced value after the algorithm is finished executing.

  • identity – [in] The identity value to use for the reduction operation. This argument is optional and defaults to a copy of var.

Returns

This returns a reduction object of unspecified type having a value type of T. When the return value is used by an algorithm, the reference to var is used as the live-out object, new views are initialized to a copy of identity, and views are combined by invoking std::multiplies{}, passing it the two views to be combined.

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, std::multiplies<T>> reduction_multiplies(T &var, T const &identity)

hpx::experimental::reduction_plus#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental

Top-level namespace.

Functions

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, std::plus<T>> reduction_plus(T &var)

The function template reduction_plus returns a reduction object of unspecified type having a value type and encapsulating an identity value for the reduction, std::plus{} as its combiner function, and a live-out object from which the initial value is obtained and into which the final value is stored.

A parallel algorithm uses the reduction_plus object by allocating an unspecified number of instances, called views, of the reduction’s value type. Each view is initialized with the reduction object’s identity value, except that the live-out object (which was initialized by the caller) comprises one of the views. The algorithm passes a reference to a view to each application of an element-access function, ensuring that no two concurrently-executing invocations share the same view. A view can be shared between two applications that do not execute concurrently, but initialization is performed only once per view.

Modifications to the view by the application of element access functions accumulate as partial results. At some point before the algorithm returns, the partial results are combined, two at a time, using the std::plus{} operation until a single value remains, which is then assigned back to the live-out object.

T shall meet the requirements of CopyConstructible and MoveAssignable.

Note

In order to produce useful results, modifications to the view should be limited to commutative operations closely related to the combiner operation.

Template Parameters
  • T – The value type to be used by the reduction object.

Parameters
  • var – [in,out] The live-out value to use for the reduction object. This will hold the reduced value after the algorithm is finished executing.

  • identity – [in] The identity value to use for the reduction operation. This argument is optional and defaults to a copy of var.

Returns

This returns a reduction object of unspecified type having a value type of T. When the return value is used by an algorithm, the reference to var is used as the live-out object, new views are initialized to a copy of identity, and views are combined by invoking std::plus{}, passing it the two views to be combined.

template<typename T>
constexpr HPX_CXX_EXPORT hpx::parallel::detail::reduction_helper<T, std::plus<T>> reduction_plus(T &var, T const &identity)

hpx::generate, hpx::generate_n#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter, typename F>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> generate(ExPolicy &&policy, FwdIter first, FwdIter last, F &&f)#

Assign each element in range [first, last) a value generated by the given function object f. Executed according to the policy.

The assignments in the parallel generate algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel generate algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Exactly distance(first, last) invocations of f and assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of generate requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – The generator function that will be called. The signature of the function should be equivalent to the following:

    Ret fun();
    
    The type Ret must be such that an object of type FwdIter can be dereferenced and assigned a value of type Ret.

Returns

The generate algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise.
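Because a parallel policy may invoke f from unspecified threads and in no particular order, a generator that carries state should be safe to call concurrently. The following is a minimal sketch of such a generator, not HPX's own example: make_ids is a hypothetical helper, and the call uses std::generate from the C++ standard library so that the block is self-contained. The overload documented above is expected to take the same arguments preceded by an execution policy.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical helper: fills a vector with n distinct ids.
// The counter is atomic, so the generator stays correct even if an algorithm
// invokes it from several threads. In that case only the position of each id
// is unspecified, not the set of ids produced.
std::vector<int> make_ids(std::size_t n)
{
    std::vector<int> ids(n);
    std::atomic<int> next{0};
    // The lambda captures the counter by reference, so it is copyable and
    // all copies share the same counter.
    std::generate(ids.begin(), ids.end(),
        [&next] { return next.fetch_add(1, std::memory_order_relaxed); });
    return ids;
}
```

After sorting, make_ids(5) always contains 0, 1, 2, 3, 4, regardless of the order in which the generator was invoked.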

template<typename FwdIter, typename F>
FwdIter generate(FwdIter first, FwdIter last, F &&f)#

Assign each element in range [first, last) a value generated by the given function object f.

Note

Complexity: Exactly distance(first, last) invocations of f and assignments.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of generate requires F to meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – The generator function that will be called. The signature of the function should be equivalent to the following:

    Ret fun();
    
    The type Ret must be such that an object of type FwdIter can be dereferenced and assigned a value of type Ret.

Returns

The generate algorithm returns a FwdIter.

template<typename ExPolicy, typename FwdIter, typename Size, typename F>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> generate_n(ExPolicy &&policy, FwdIter first, Size count, F &&f)#

Assigns each element in range [first, first + count) a value generated by the given function object f. Executed according to the policy.

The assignments in the parallel generate_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel generate_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Exactly count invocations of f and assignments, for count > 0.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Size – The type of a non-negative integral value specifying the number of items to iterate over.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of generate_n requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements in the sequence the algorithm will be applied to.

  • f – Refers to the generator function object that will be called. The signature of the function should be equivalent to

    Ret fun();
    
    The type Ret must be such that an object of type FwdIter can be dereferenced and assigned a value of type Ret.

Returns

The generate_n algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The returned iterator points one past the last element assigned if count > 0, and equals first otherwise.

template<typename FwdIter, typename Size, typename F>
FwdIter generate_n(FwdIter first, Size count, F &&f)#

Assigns each element in range [first, first + count) a value generated by the given function object f.

Note

Complexity: Exactly count invocations of f and assignments, for count > 0.

Template Parameters
  • Size – The type of a non-negative integral value specifying the number of items to iterate over.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of generate_n requires F to meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements in the sequence the algorithm will be applied to.

  • f – Refers to the generator function object that will be called. The signature of the function should be equivalent to

    Ret fun();
    
    The type Ret must be such that an object of type FwdIter can be dereferenced and assigned a value of type Ret.

Returns

The generate_n algorithm returns a FwdIter. The returned iterator points one past the last element assigned if count > 0, and equals first otherwise.
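The return value rule can be illustrated with a short sketch. It assumes a hypothetical helper fill_prefix and uses std::generate_n from the C++ standard library, which follows the same rule for its returned iterator, so the block compiles without HPX.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: overwrites the first `count` elements of `v` with 7
// and reports how far the returned iterator is from v.begin().
// Precondition assumed by this sketch: count does not exceed v.size().
std::ptrdiff_t fill_prefix(std::vector<int>& v, int count)
{
    assert(count <= 0 || static_cast<std::size_t>(count) <= v.size());
    // For count <= 0 nothing is assigned and the iterator equals v.begin().
    auto const past_last = std::generate_n(v.begin(), count, [] { return 7; });
    return std::distance(v.begin(), past_last);
}
```

For a vector {1, 2, 3, 4}, fill_prefix(v, 2) returns 2 and leaves {7, 7, 3, 4}, while fill_prefix(v, 0) returns 0 and changes nothing.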

hpx::includes#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Pred = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool>::type includes(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, Pred &&op = Pred())#

Returns true if every element from the sorted range [first2, last2) is found within the sorted range [first1, last1). Also returns true if [first2, last2) is empty. This version expects both ranges to be sorted with respect to the user-supplied binary predicate op. Executed according to the policy.

The comparison operations in the parallel includes algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel includes algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most 2*(N1+N2-1) comparisons, where N1 = std::distance(first1, last1) and N2 = std::distance(first2, last2).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of includes requires Pred to meet the requirements of CopyConstructible. This defaults to std::less<>.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • op – The binary predicate that defines the ordering by which both ranges are sorted; it returns true if its first argument is ordered before its second. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The includes algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The result is true if every element from the sorted range [first2, last2) is found within the sorted range [first1, last1), or if [first2, last2) is empty.

template<typename FwdIter1, typename FwdIter2, typename Pred = hpx::parallel::detail::less>
bool includes(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, Pred &&op = Pred())#

Returns true if every element from the sorted range [first2, last2) is found within the sorted range [first1, last1). Also returns true if [first2, last2) is empty. This version expects both ranges to be sorted with respect to the user-supplied binary predicate op.

Note

Complexity: At most 2*(N1+N2-1) comparisons, where N1 = std::distance(first1, last1) and N2 = std::distance(first2, last2).

Template Parameters
  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of includes requires Pred to meet the requirements of CopyConstructible. This defaults to std::less<>.

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • op – The binary predicate that defines the ordering by which both ranges are sorted; it returns true if its first argument is ordered before its second. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The includes algorithm returns a bool. The result is true if every element from the sorted range [first2, last2) is found within the sorted range [first1, last1), or if [first2, last2) is empty.
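The requirement that both ranges are sorted by the same predicate is easy to overlook. The sketch below is an illustration under stated assumptions rather than HPX code: is_sorted_subset is a hypothetical helper, and it calls std::includes from the C++ standard library, which has the same contract as the non-policy overload above.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: checks whether every element of `needles` occurs in
// `haystack`. Both inputs must already be sorted with respect to `comp`;
// the assertions make that precondition explicit.
template <typename Comp = std::less<>>
bool is_sorted_subset(std::vector<int> const& haystack,
    std::vector<int> const& needles, Comp comp = Comp())
{
    assert(std::is_sorted(haystack.begin(), haystack.end(), comp));
    assert(std::is_sorted(needles.begin(), needles.end(), comp));
    return std::includes(haystack.begin(), haystack.end(),
        needles.begin(), needles.end(), comp);
}
```

With ascending data, is_sorted_subset({1, 2, 4, 7, 9}, {2, 7}) is true and is_sorted_subset({1, 2, 4, 7, 9}, {2, 3}) is false. Descending data needs a matching predicate, for example is_sorted_subset({9, 7, 4, 2, 1}, {7, 2}, std::greater<>()). Duplicates count individually, so {1, 2, 2} does not include {2, 2, 2}.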

hpx::inclusive_scan#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter, typename OutIter>
OutIter inclusive_scan(InIter first, InIter last, OutIter dest)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(+, *first, …, *(first + (i - dest))).

The reduce operations in the parallel inclusive_scan algorithm invoked without an execution policy object will execute in sequential order in the calling thread.

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum.

Note

Complexity: O(last - first) applications of the binary operation op, here std::plus<>().

Note

GENERALIZED_NONCOMMUTATIVE_SUM(+, a1, …, aN) is defined as:

  • a1 when N is 1

  • GENERALIZED_NONCOMMUTATIVE_SUM(+, a1, …, aK) + GENERALIZED_NONCOMMUTATIVE_SUM(+, aM, …, aN) where 1 < K+1 = M <= N.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The inclusive_scan algorithm returns OutIter. The inclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.
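The difference between the inclusive and the exclusive scan is easiest to see side by side. The following sketch assumes two hypothetical wrapper functions and uses std::inclusive_scan and std::exclusive_scan from the C++17 header <numeric>, which the overload above is modeled after.

```cpp
#include <numeric>
#include <vector>

// Hypothetical wrapper: out[i] is the sum of in[0] .. in[i], so the ith
// input element is part of the ith result.
std::vector<int> running_sum_inclusive(std::vector<int> const& in)
{
    std::vector<int> out(in.size());
    std::inclusive_scan(in.begin(), in.end(), out.begin());
    return out;
}

// Hypothetical wrapper: out[i] is init plus the sum of in[0] .. in[i-1], so
// the ith input element is not part of the ith result.
std::vector<int> running_sum_exclusive(std::vector<int> const& in, int init)
{
    std::vector<int> out(in.size());
    std::exclusive_scan(in.begin(), in.end(), out.begin(), init);
    return out;
}
```

For the input {1, 2, 3, 4}, the inclusive scan yields {1, 3, 6, 10} and the exclusive scan with init 0 yields {0, 1, 3, 6}.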

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> inclusive_scan(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(+, *first, …, *(first + (i - dest))). Executed according to the policy.

The reduce operations in the parallel inclusive_scan algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel inclusive_scan algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum.

Note

Complexity: O(last - first) applications of the binary operation op, here std::plus<>().

Note

GENERALIZED_NONCOMMUTATIVE_SUM(+, a1, …, aN) is defined as:

  • a1 when N is 1

  • GENERALIZED_NONCOMMUTATIVE_SUM(+, a1, …, aK) + GENERALIZED_NONCOMMUTATIVE_SUM(+, aM, …, aN) where 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The inclusive_scan algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The inclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename InIter, typename OutIter, typename Op>
OutIter inclusive_scan(InIter first, InIter last, OutIter dest, Op &&op)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(op, *first, …, *(first + (i - dest))).

The reduce operations in the parallel inclusive_scan algorithm invoked without an execution policy object will execute in sequential order in the calling thread.

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum.

Note

Complexity: O(last - first) applications of the binary operation op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • Op – The type of the binary function object used for the reduction operation.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – Specifies the function (or function object) which will be invoked for each of the values of the input sequence. This is a binary operation. The signature of this function should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Ret must be such that an object of a type as given by the input sequence can be implicitly converted to any of those types.

Returns

The inclusive_scan algorithm returns OutIter. The inclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.
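Any binary operation with the signature described above can replace addition. As a sketch, assuming a hypothetical helper running_max and using std::inclusive_scan from the C++17 standard library in place of the HPX overload, a running maximum looks like this:

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical helper: out[i] is the largest value among in[0] .. in[i].
// std::max is associative, so the grouping chosen by the scan cannot change
// the result.
std::vector<int> running_max(std::vector<int> const& in)
{
    std::vector<int> out(in.size());
    std::inclusive_scan(in.begin(), in.end(), out.begin(),
        [](int a, int b) { return std::max(a, b); });
    return out;
}
```

For the input {3, 1, 4, 1, 5} this produces {3, 3, 4, 4, 5}.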

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Op>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> inclusive_scan(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, Op &&op)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(op, *first, …, *(first + (i - dest))). Executed according to the policy.

The reduce operations in the parallel inclusive_scan algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel inclusive_scan algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum.

Note

Complexity: O(last - first) applications of the binary operation op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Op – The type of the binary function object used for the reduction operation.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – Specifies the function (or function object) which will be invoked for each of the values of the input sequence. This is a binary operation. The signature of this function should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Ret must be such that an object of a type as given by the input sequence can be implicitly converted to any of those types.

Returns

The inclusive_scan algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The inclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename InIter, typename OutIter, typename Op, typename T>
OutIter inclusive_scan(InIter first, InIter last, OutIter dest, Op &&op, T init)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(op, init, *first, …, *(first + (i - dest))).

The reduce operations in the parallel inclusive_scan algorithm invoked without an execution policy object will execute in sequential order in the calling thread.

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum. If op is not mathematically associative, the behavior of inclusive_scan may be non-deterministic.

Note

Complexity: O(last - first) applications of the binary operation op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.
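The definition above fixes the order of the operands but leaves the split point K open, which is why a non-associative op can give different results from run to run. The sketch below makes this concrete. It is an illustration only: gnsum is a hypothetical function that evaluates the generalized sum for one of two fixed split strategies, and it is not how any scan is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Evaluates GENERALIZED_NONCOMMUTATIVE_SUM(op, a[lo], ..., a[hi - 1]) for one
// particular choice of split point:
//   split_early == true  -> split after the first element
//   split_early == false -> split before the last element
// Both choices keep the operands in their original order.
int gnsum(std::function<int(int, int)> const& op, std::vector<int> const& a,
    std::size_t lo, std::size_t hi, bool split_early)
{
    assert(lo < hi);    // the sum is only defined for N >= 1
    if (hi - lo == 1)
        return a[lo];    // "a1 when N is 1"
    std::size_t const mid = split_early ? lo + 1 : hi - 1;
    return op(gnsum(op, a, lo, mid, split_early),
        gnsum(op, a, mid, hi, split_early));
}
```

For a = {1, 2, 3, 4} and std::plus<>(), both strategies give 10. With std::minus<>(), which is not associative, splitting early evaluates 1 - (2 - (3 - 4)) = -2, while splitting late evaluates ((1 - 2) - 3) - 4 = -8.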

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • Op – The type of the binary function object used for the reduction operation.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – Specifies the function (or function object) which will be invoked for each of the values of the input sequence. This is a binary operation. The signature of this function should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Ret must be such that an object of a type as given by the input sequence can be implicitly converted to any of those types.

  • init – The initial value for the generalized sum.

Returns

The inclusive_scan algorithm returns OutIter. The inclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.
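The init argument acts as an extra leading operand of the generalized sum, and its type T is also used for the intermediate values. A minimal sketch, assuming a hypothetical helper running_product and using std::inclusive_scan from the C++17 standard library rather than the HPX overload:

```cpp
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: out[i] is init * in[0] * ... * in[i].
// init is a long, so the intermediate products are accumulated as long even
// though the input elements are int.
std::vector<long> running_product(std::vector<int> const& in, long init)
{
    std::vector<long> out(in.size());
    std::inclusive_scan(in.begin(), in.end(), out.begin(),
        std::multiplies<>(), init);
    return out;
}
```

For the input {1, 2, 3, 4} and init 2 this yields {2, 4, 12, 48}.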

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Op, typename T>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> inclusive_scan(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, Op &&op, T init)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(op, init, *first, …, *(first + (i - dest))). Executed according to the policy.

The reduce operations in the parallel inclusive_scan algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel inclusive_scan algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum. If op is not mathematically associative, the behavior of inclusive_scan may be non-deterministic.

Note

Complexity: O(last - first) applications of the binary operation op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Op – The type of the binary function object used for the reduction operation.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – Specifies the function (or function object) which will be invoked for each of the values of the input sequence. This is a binary operation. The signature of this function should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Ret must be such that an object of a type as given by the input sequence can be implicitly converted to any of those types.

  • init – The initial value for the generalized sum.

Returns

The inclusive_scan algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The inclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.

hpx::is_heap, hpx::is_heap_until#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename RandIter, typename Comp = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool> is_heap(ExPolicy &&policy, RandIter first, RandIter last, Comp &&comp = Comp())#

Returns whether the range is a max heap: true if it is, false otherwise. The function uses the given comparison function object comp (defaults to using operator<()). Executed according to the policy.

comp has to induce a strict weak ordering on the values.

The applications of function objects in the parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The applications of function objects in the parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Linear in the distance between first and last.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • RandIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • comp – A callable object. The return value of the INVOKE operation applied to an object of type Comp, when contextually converted to bool, yields true if the first argument of the call is less than the second, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator.

Returns

The is_heap algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The result is true if the range is a max heap and false otherwise.

template<typename RandIter, typename Comp = hpx::parallel::detail::less>
bool is_heap(RandIter first, RandIter last, Comp &&comp = Comp())#

Returns whether the range is a max heap: true if it is, false otherwise. The function uses the given comparison function object comp (defaults to using operator<()).

comp has to induce a strict weak ordering on the values.

Note

Complexity: Linear in the distance between first and last.

Template Parameters
  • RandIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • comp – A callable object. The return value of the INVOKE operation applied to an object of type Comp, when contextually converted to bool, yields true if the first argument of the call is less than the second, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator.

Returns

The is_heap algorithm returns a bool. The result is true if the range is a max heap and false otherwise.
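Whether a range counts as a heap depends on the comparison used. The sketch below assumes two hypothetical helpers and calls std::is_heap from the C++ standard library, which has the same meaning as the non-policy overload above: the default comparison tests for a max heap, and std::greater<>() tests for a min heap.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helper: true if no element is greater than its parent, where
// the parent of index i is index (i - 1) / 2.
bool is_max_heap(std::vector<int> const& v)
{
    return std::is_heap(v.begin(), v.end());
}

// Hypothetical helper: the same check with the ordering reversed, so no
// element may be smaller than its parent.
bool is_min_heap(std::vector<int> const& v)
{
    return std::is_heap(v.begin(), v.end(), std::greater<>());
}
```

For example, {9, 5, 8, 1, 3} is a max heap but not a min heap, {1, 3, 2, 7, 4} is a min heap but not a max heap, and an empty range satisfies both.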

template<typename ExPolicy, typename RandIter, typename Comp = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, RandIter> is_heap_until(ExPolicy &&policy, RandIter first, RandIter last, Comp &&comp = Comp())#

Returns the upper bound of the largest range beginning at first which is a max heap. That is, the last iterator it for which range [first, it) is a max heap. The function uses the given comparison function object comp (defaults to using operator<()). Executed according to the policy.

comp has to induce a strict weak ordering on the values.

The applications of function objects in the parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The applications of function objects in the parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Linear in the distance between first and last.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • RandIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • comp – comp is a callable object. The return value of the INVOKE operation applied to an object of type Comp, when contextually converted to bool, yields true if the first argument of the call is less than the second, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator.

Returns

The is_heap_until algorithm returns a hpx::future<RandIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns RandIter otherwise. The is_heap_until algorithm returns the upper bound of the largest range beginning at first which is a max heap. That is, the last iterator it for which range [first, it) is a max heap.

template<typename RandIter, typename Comp = hpx::parallel::detail::less>
RandIter is_heap_until(RandIter first, RandIter last, Comp &&comp = Comp())#

Returns the upper bound of the largest range beginning at first which is a max heap. That is, the last iterator it for which range [first, it) is a max heap. The function uses the given comparison function object comp (defaults to using operator<()).

comp has to induce a strict weak ordering on the values.

Note

Complexity: Linear in the distance between first and last.

Template Parameters
  • RandIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • comp – comp is a callable object. The return value of the INVOKE operation applied to an object of type Comp, when contextually converted to bool, yields true if the first argument of the call is less than the second, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator.

Returns

The is_heap_until algorithm returns a RandIter. The is_heap_until algorithm returns the upper bound of the largest range beginning at first which is a max heap. That is, the last iterator it for which range [first, it) is a max heap.
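To make the "upper bound of the largest heap prefix" concrete, here is a minimal sketch. It uses std::is_heap_until, the standard-library counterpart, so that it builds without HPX; given the signature documented above, hpx::is_heap_until should be a drop-in replacement for the non-policy call. The helper name heap_prefix_length is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of leading elements of `v` that form a max heap.
inline std::size_t heap_prefix_length(std::vector<int> const& v)
{
    // `it` is the last iterator for which [v.begin(), it) is a max heap.
    auto it = std::is_heap_until(v.begin(), v.end());
    return static_cast<std::size_t>(it - v.begin());
}
```

For the input {9, 5, 8, 1, 3, 10, 2} the element 10 at index 5 is larger than its parent 8 at index 2, so the returned iterator points at index 5. When the whole range is a heap, the returned iterator equals last.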

hpx::is_partitioned#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter, typename Pred>
bool is_partitioned(FwdIter first, FwdIter last, Pred &&pred)#

Determines if the range [first, last) is partitioned.

Note

Complexity: at most N predicate evaluations, where N = distance(first, last).

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Refers to the unary predicate which returns true for elements expected to be found in the beginning of the range. The signature of the function should be equivalent to

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The is_partitioned algorithm returns bool. The is_partitioned algorithm returns true if each element in the sequence for which pred returns true precedes those for which pred returns false. Otherwise is_partitioned returns false. If the range [first, last) contains fewer than two elements, the function always returns true.
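The return rule above can be sketched with a small example. It uses std::is_partitioned, the standard-library counterpart, so that it compiles without HPX; assuming the signature documented above, the same call with hpx::is_partitioned should give the same answers. The helper name evens_first and the even/odd predicate are illustrative assumptions.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: true if every even value in `v` precedes every odd value.
inline bool evens_first(std::vector<int> const& v)
{
    // Unary predicate: returns true for elements expected at the front.
    auto is_even = [](int x) { return x % 2 == 0; };
    return std::is_partitioned(v.begin(), v.end(), is_even);
}
```

A range in which the predicate is false for every element still counts as partitioned, because no "true" element follows a "false" one.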

template<typename ExPolicy, typename FwdIter, typename Pred>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool> is_partitioned(ExPolicy &&policy, FwdIter first, FwdIter last, Pred &&pred)#

Determines if the range [first, last) is partitioned. Executed according to the policy.

The predicate operations in the parallel is_partitioned algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The predicate operations in the parallel is_partitioned algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: at most N predicate evaluations, where N = distance(first, last).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Pred must be CopyConstructible when using a parallel policy.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Refers to the unary predicate which returns true for elements expected to be found in the beginning of the range. The signature of the function should be equivalent to

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The is_partitioned algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The is_partitioned algorithm returns true if each element in the sequence for which pred returns true precedes those for which pred returns false. Otherwise is_partitioned returns false. If the range [first, last) contains fewer than two elements, the function always returns true.

hpx::is_sorted, hpx::is_sorted_until#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter, typename Pred = hpx::parallel::detail::less>
bool is_sorted(FwdIter first, FwdIter last, Pred &&pred = Pred())#

Determines if the range [first, last) is sorted. Uses pred to compare elements.

The comparison operations in the is_sorted algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: at most N+S-1 comparisons, where N = distance(first, last) and S = number of partitions.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Refers to the binary predicate which returns true if the first argument should be treated as less than the second argument. The signature of the function should be equivalent to

    bool pred(const Type &a, const Type &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The is_sorted algorithm returns a bool. The is_sorted algorithm returns true if the elements in the sequence [first, last) are sorted with respect to pred. If the range [first, last) contains fewer than two elements, the function always returns true.
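The effect of the optional predicate can be sketched as follows. The example uses std::is_sorted, the standard-library counterpart, so that it compiles without HPX; assuming the signature documented above, hpx::is_sorted should accept the same arguments. The helper names ascending and descending are hypothetical.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helper: sorted in non-decreasing order (default comparison).
inline bool ascending(std::vector<int> const& v)
{
    return std::is_sorted(v.begin(), v.end());
}

// Hypothetical helper: sorted in non-increasing order, using std::greater
// as the binary predicate `pred`.
inline bool descending(std::vector<int> const& v)
{
    return std::is_sorted(v.begin(), v.end(), std::greater<int>{});
}
```

Equal neighbouring elements do not break sortedness under either predicate, since the check only fails when a later element compares strictly "less" than an earlier one.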

template<typename ExPolicy, typename FwdIter, typename Pred = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool> is_sorted(ExPolicy &&policy, FwdIter first, FwdIter last, Pred &&pred = Pred())#

Determines if the range [first, last) is sorted. Uses pred to compare elements. Executed according to the policy.

The comparison operations in the parallel is_sorted algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel is_sorted algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: at most N+S-1 comparisons, where N = distance(first, last) and S = number of partitions.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of is_sorted requires Pred to meet the requirements of CopyConstructible. This defaults to std::less<>.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Refers to the binary predicate which returns true if the first argument should be treated as less than the second argument. The signature of the function should be equivalent to

    bool pred(const Type &a, const Type &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The is_sorted algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The is_sorted algorithm returns true if the elements in the sequence [first, last) are sorted with respect to pred. If the range [first, last) contains fewer than two elements, the function always returns true.

template<typename FwdIter, typename Pred = hpx::parallel::detail::less>
FwdIter is_sorted_until(FwdIter first, FwdIter last, Pred &&pred = Pred())#

Returns an iterator to the first element in the range [first, last) that is not sorted. Uses a predicate to compare elements, or the less-than operator if none is given.

The comparison operations in the is_sorted_until algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: at most N+S-1 comparisons, where N = distance(first, last) and S = number of partitions.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Refers to the binary predicate which returns true if the first argument should be treated as less than the second argument. The signature of the function should be equivalent to

    bool pred(const Type &a, const Type &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The is_sorted_until algorithm returns a FwdIter. The is_sorted_until algorithm returns an iterator to the first unsorted element. If the sequence has fewer than two elements or the sequence is sorted, last is returned.
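A short sketch of how the returned iterator can be interpreted follows. It uses std::is_sorted_until, the standard-library counterpart, so that it compiles without HPX; assuming the signature documented above, hpx::is_sorted_until should be interchangeable for the non-policy call. The helper name sorted_prefix_length is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: number of leading elements of `v` that are sorted.
inline std::size_t sorted_prefix_length(std::vector<int> const& v)
{
    // `it` points at the first element that breaks the ordering,
    // or equals v.end() if the whole range is sorted.
    auto it = std::is_sorted_until(v.begin(), v.end());
    return static_cast<std::size_t>(std::distance(v.begin(), it));
}
```

For {1, 2, 5, 3, 4} the iterator points at the value 3 (index 3), the first element smaller than its predecessor.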

template<typename ExPolicy, typename FwdIter, typename Pred = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result<ExPolicy, FwdIter>::type is_sorted_until(ExPolicy &&policy, FwdIter first, FwdIter last, Pred &&pred = Pred())#

Returns an iterator to the first element in the range [first, last) that is not sorted. Uses a predicate to compare elements, or the less-than operator if none is given. Executed according to the policy.

The comparison operations in the parallel is_sorted_until algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel is_sorted_until algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: at most N+S-1 comparisons, where N = distance(first, last) and S = number of partitions.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of is_sorted_until requires Pred to meet the requirements of CopyConstructible. This defaults to std::less<>.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Refers to the binary predicate which returns true if the first argument should be treated as less than the second argument. The signature of the function should be equivalent to

    bool pred(const Type &a, const Type &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The is_sorted_until algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The is_sorted_until algorithm returns an iterator to the first unsorted element. If the sequence has fewer than two elements or the sequence is sorted, last is returned.

hpx::lexicographical_compare#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter1, typename InIter2, typename Pred = hpx::parallel::detail::less>
bool lexicographical_compare(InIter1 first1, InIter1 last1, InIter2 first2, InIter2 last2, Pred &&pred)#

Checks if the first range [first1, last1) is lexicographically less than the second range [first2, last2). Uses a provided predicate to compare elements.

The comparison operations in the parallel lexicographical_compare algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: At most 2 * min(N1, N2) applications of the comparison operation, where N1 = std::distance(first1, last1) and N2 = std::distance(first2, last2).

Note

Lexicographical comparison is an operation with the following properties

  • Two ranges are compared element by element

  • The first mismatching element defines which range is lexicographically less or greater than the other

  • If one range is a prefix of another, the shorter range is lexicographically less than the other

  • If two ranges have equivalent elements and are of the same length, then the ranges are lexicographically equal

  • An empty range is lexicographically less than any non-empty range

  • Two empty ranges are lexicographically equal

Template Parameters
  • InIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of an input iterator.

  • InIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of an input iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of lexicographical_compare requires Pred to meet the requirements of CopyConstructible. This defaults to std::less<>.

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • pred – Refers to the comparison function that will be applied to the elements of the first and second ranges.

Returns

The lexicographical_compare algorithm returns bool if no execution policy object is passed in. The lexicographical_compare algorithm returns true if the first range [first1, last1) is lexicographically less than the second range [first2, last2); otherwise it returns false.
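The properties listed in the note above can be checked with a small sketch. It uses std::lexicographical_compare, the standard-library counterpart, so that it compiles without HPX; assuming the signature documented above, hpx::lexicographical_compare should accept the same arguments. The helper name lex_less and the use of std::string as the range type are illustrative assumptions.

```cpp
#include <algorithm>
#include <functional>
#include <string>

// Hypothetical helper: true if `a` is lexicographically less than `b`,
// comparing characters with std::less as the predicate `pred`.
inline bool lex_less(std::string const& a, std::string const& b)
{
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(), std::less<char>{});
}
```

With this helper, "abc" compares less than "abd" (first mismatch decides), "ab" compares less than "abc" (a proper prefix is less), and two equal or two empty ranges compare as not less in either direction.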

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Pred = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool> lexicographical_compare(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, Pred &&pred)#

Checks if the first range [first1, last1) is lexicographically less than the second range [first2, last2). Uses a provided predicate to compare elements. Executed according to the policy.

The comparison operations in the parallel lexicographical_compare algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel lexicographical_compare algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most 2 * min(N1, N2) applications of the comparison operation, where N1 = std::distance(first1, last1) and N2 = std::distance(first2, last2).

Note

Lexicographical comparison is an operation with the following properties

  • Two ranges are compared element by element

  • The first mismatching element defines which range is lexicographically less or greater than the other

  • If one range is a prefix of another, the shorter range is lexicographically less than the other

  • If two ranges have equivalent elements and are of the same length, then the ranges are lexicographically equal

  • An empty range is lexicographically less than any non-empty range

  • Two empty ranges are lexicographically equal

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of lexicographical_compare requires Pred to meet the requirements of CopyConstructible. This defaults to std::less<>.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • pred – Refers to the comparison function that will be applied to the elements of the first and second ranges.

Returns

The lexicographical_compare algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The lexicographical_compare algorithm returns true if the first range [first1, last1) is lexicographically less than the second range [first2, last2); otherwise it returns false.

hpx::make_heap#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename RndIter, typename Comp>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy> make_heap(ExPolicy &&policy, RndIter first, RndIter last, Comp &&comp)#

Constructs a max heap in the range [first, last). Executed according to the policy.

The comparison operations in the parallel make_heap algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel make_heap algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: at most (3*N) comparisons where N = distance(first, last).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • RndIter – The type of the source iterators used for the algorithm. This iterator must meet the requirements for a random access iterator.

  • Comp – Comparison function object (i.e. an object that satisfies the requirements of Compare) which returns true if the first argument is less than the second. The signature of the comparison function should be equivalent to the following:

    bool cmp(const Type1 &a, const Type2 &b);
    
    While the signature does not need to have const &, the function must not modify the objects passed to it and must be able to accept all values of type (possibly const) Type1 and Type2 regardless of value category (thus, Type1 & is not allowed, nor is Type1 unless for Type1 a move is equivalent to a copy). The types Type1 and Type2 must be such that an object of type RndIter can be dereferenced and then implicitly converted to both of them.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • comp – Refers to the binary predicate which returns true if the first argument should be treated as less than the second. The signature of the function should be equivalent to

    bool comp(const Type &a, const Type &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type RndIter can be dereferenced and then implicitly converted to Type.

Returns

The make_heap algorithm returns a hpx::future<void> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns void otherwise.

template<typename ExPolicy, typename RndIter>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy> make_heap(ExPolicy &&policy, RndIter first, RndIter last)#

Constructs a max heap in the range [first, last). Uses the operator < for comparisons. Executed according to the policy.

The comparison operations in the parallel make_heap algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel make_heap algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: at most (3*N) comparisons where N = distance(first, last).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • RndIter – The type of the source iterators used for the algorithm. This iterator must meet the requirements for a random access iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The make_heap algorithm returns a hpx::future<void> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns void otherwise.

template<typename RndIter, typename Comp>
void make_heap(RndIter first, RndIter last, Comp &&comp)#

Constructs a max heap in the range [first, last).

Note

Complexity: at most (3*N) comparisons where N = distance(first, last).

Template Parameters
  • RndIter – The type of the source iterators used for the algorithm. This iterator must meet the requirements for a random access iterator.

  • Comp – Comparison function object (i.e. an object that satisfies the requirements of Compare) which returns true if the first argument is less than the second. The signature of the comparison function should be equivalent to the following:

    bool cmp(const Type1 &a, const Type2 &b);
    
    While the signature does not need to have const &, the function must not modify the objects passed to it and must be able to accept all values of type (possibly const) Type1 and Type2 regardless of value category (thus, Type1 & is not allowed, nor is Type1 unless for Type1 a move is equivalent to a copy). The types Type1 and Type2 must be such that an object of type RndIter can be dereferenced and then implicitly converted to both of them.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • comp – Refers to the binary predicate which returns true if the first argument should be treated as less than the second. The signature of the function should be equivalent to

    bool comp(const Type &a, const Type &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type RndIter can be dereferenced and then implicitly converted to Type.

Returns

The make_heap algorithm returns void.

template<typename RndIter>
void make_heap(RndIter first, RndIter last)#

Constructs a max heap in the range [first, last).

Note

Complexity: at most (3*N) comparisons where N = distance(first, last).

Template Parameters
  • RndIter – The type of the source iterators used for the algorithm. This iterator must meet the requirements for a random access iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The make_heap algorithm returns void.
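The following sketch shows what "constructs a max heap" means in practice, and how a comparison object changes the result. It uses std::make_heap, the standard-library counterpart, so that it compiles without HPX; assuming the signatures documented above, the non-policy hpx::make_heap overloads should accept the same arguments. The helper names heapified and min_heapified are hypothetical.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helper: returns a copy of the input rearranged into a max heap
// (default comparison, operator<).
inline std::vector<int> heapified(std::vector<int> v)
{
    std::make_heap(v.begin(), v.end());
    return v;
}

// Hypothetical helper: returns a copy rearranged into a min heap by passing
// std::greater as the comparison object `comp`.
inline std::vector<int> min_heapified(std::vector<int> v)
{
    std::make_heap(v.begin(), v.end(), std::greater<int>{});
    return v;
}
```

After the call the largest element (or the smallest, with std::greater) sits at the front of the range. The relative order of the remaining elements is only constrained by the heap property, so it should not be relied upon.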

hpx::merge, hpx::inplace_merge#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename RandIter1, typename RandIter2, typename RandIter3, typename Comp = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, RandIter3> merge(ExPolicy &&policy, RandIter1 first1, RandIter1 last1, RandIter2 first2, RandIter2 last2, RandIter3 dest, Comp &&comp = Comp())#

Merges two sorted ranges [first1, last1) and [first2, last2) into one sorted range beginning at dest. The order of equivalent elements in each of the original two ranges is preserved. For equivalent elements in the original two ranges, the elements from the first range precede the elements from the second range. The destination range cannot overlap with either of the input ranges. Executed according to the policy.

The assignments in the parallel merge algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel merge algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs O(std::distance(first1, last1) + std::distance(first2, last2)) applications of the comparison comp and each projection.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • RandIter1 – The type of the source iterators used (deduced) representing the first sorted range. This iterator type must meet the requirements of a random access iterator.

  • RandIter2 – The type of the source iterators used (deduced) representing the second sorted range. This iterator type must meet the requirements of a random access iterator.

  • RandIter3 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of merge requires Comp to meet the requirements of CopyConstructible. This defaults to std::less<>

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the first range of elements the algorithm will be applied to.

  • last1 – Refers to the end of the first range of elements the algorithm will be applied to.

  • first2 – Refers to the beginning of the second range of elements the algorithm will be applied to.

  • last2 – Refers to the end of the second range of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • comp – A callable object which returns true if the first argument is less than the second, and false otherwise. The signature of this comparison should be equivalent to:

    bool comp(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types RandIter1 and RandIter2 can be dereferenced and then implicitly converted to both Type1 and Type2

Returns

The merge algorithm returns a hpx::future<RandIter3> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns RandIter3 otherwise. The merge algorithm returns the destination iterator to the end of the dest range.
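The ordering guarantee for equivalent elements can be illustrated with a small sketch. It uses std::merge from the C++ standard library as a stand-in, on the assumption that hpx::merge follows the same contract. The Record alias and the merge_by_key helper are hypothetical names introduced only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// A record ordered by its integer key only; the string tags the source range.
using Record = std::pair<int, std::string>;

// Merge two key-sorted inputs into a new vector.
// Assumption: std::merge stands in for hpx::merge here.
std::vector<Record> merge_by_key(
    std::vector<Record> const& a, std::vector<Record> const& b)
{
    auto by_key = [](Record const& x, Record const& y) {
        return x.first < y.first;
    };

    // Both inputs must already be sorted with respect to the comparison.
    assert(std::is_sorted(a.begin(), a.end(), by_key));
    assert(std::is_sorted(b.begin(), b.end(), by_key));

    // The destination must not overlap the inputs, so write to fresh storage.
    std::vector<Record> out(a.size() + b.size());
    auto end = std::merge(
        a.begin(), a.end(), b.begin(), b.end(), out.begin(), by_key);

    // The returned iterator marks the end of the written destination range.
    assert(end == out.end());
    return out;
}
```

When both inputs contain a record with the same key, the record from the first input appears first in the result. With an HPX build, an execution policy such as hpx::execution::par would presumably be passed as an additional first argument.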

template<typename RandIter1, typename RandIter2, typename RandIter3, typename Comp = hpx::parallel::detail::less>
RandIter3 merge(RandIter1 first1, RandIter1 last1, RandIter2 first2, RandIter2 last2, RandIter3 dest, Comp &&comp = Comp())#

Merges two sorted ranges [first1, last1) and [first2, last2) into one sorted range beginning at dest. The order of equivalent elements in each of the original two ranges is preserved. For equivalent elements in the original two ranges, the elements from the first range precede the elements from the second range. The destination range cannot overlap with either of the input ranges.

Note

Complexity: Performs O(std::distance(first1, last1) + std::distance(first2, last2)) applications of the comparison comp and each projection.

Template Parameters
  • RandIter1 – The type of the source iterators used (deduced) representing the first sorted range. This iterator type must meet the requirements of a random access iterator.

  • RandIter2 – The type of the source iterators used (deduced) representing the second sorted range. This iterator type must meet the requirements of a random access iterator.

  • RandIter3 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of merge requires Comp to meet the requirements of CopyConstructible. This defaults to std::less<>

Parameters
  • first1 – Refers to the beginning of the first range of elements the algorithm will be applied to.

  • last1 – Refers to the end of the first range of elements the algorithm will be applied to.

  • first2 – Refers to the beginning of the second range of elements the algorithm will be applied to.

  • last2 – Refers to the end of the second range of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • comp – A callable object which returns true if the first argument is less than the second, and false otherwise. The signature of this comparison should be equivalent to:

    bool comp(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types RandIter1 and RandIter2 can be dereferenced and then implicitly converted to both Type1 and Type2

Returns

The merge algorithm returns a RandIter3. The merge algorithm returns the destination iterator to the end of the dest range.
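The comparison passed as comp has to match the order in which both inputs are sorted. The following sketch merges two descending ranges. It uses std::merge as a stand-in for the HPX overload (an assumption), and merge_descending is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Merge two ranges that are sorted in descending order.
// Assumption: std::merge stands in for hpx::merge here.
std::vector<int> merge_descending(
    std::vector<int> const& a, std::vector<int> const& b)
{
    // std::greater<> plays the role of "less than" for a descending order.
    std::greater<> comp;

    // The inputs must be sorted with respect to the same comparison.
    assert(std::is_sorted(a.begin(), a.end(), comp));
    assert(std::is_sorted(b.begin(), b.end(), comp));

    std::vector<int> out(a.size() + b.size());
    std::merge(a.begin(), a.end(), b.begin(), b.end(), out.begin(), comp);
    return out;
}
```

Calling the algorithm with the default comparison on descending inputs would violate the sortedness precondition, so the result would not be meaningful.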

template<typename ExPolicy, typename RandIter, typename Comp = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy> inplace_merge(ExPolicy &&policy, RandIter first, RandIter middle, RandIter last, Comp &&comp = Comp())#

Merges two consecutive sorted ranges [first, middle) and [middle, last) into one sorted range [first, last). The order of equivalent elements in each of the original two ranges is preserved. For equivalent elements in the original two ranges, the elements from the first range precede the elements from the second range. Executed according to the policy.

The assignments in the parallel inplace_merge algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel inplace_merge algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs O(std::distance(first, last)) applications of the comparison comp and each projection.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • RandIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of inplace_merge requires Comp to meet the requirements of CopyConstructible. This defaults to std::less<>

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the first sorted range the algorithm will be applied to.

  • middle – Refers to the end of the first sorted range and the beginning of the second sorted range the algorithm will be applied to.

  • last – Refers to the end of the second sorted range the algorithm will be applied to.

  • comp – A callable object which returns true if the first argument is less than the second, and false otherwise. The signature of this comparison should be equivalent to:

    bool comp(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types RandIter can be dereferenced and then implicitly converted to both Type1 and Type2

Returns

The inplace_merge algorithm returns a hpx::future<void> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns void otherwise.

template<typename RandIter, typename Comp = hpx::parallel::detail::less>
void inplace_merge(RandIter first, RandIter middle, RandIter last, Comp &&comp = Comp())#

Merges two consecutive sorted ranges [first, middle) and [middle, last) into one sorted range [first, last). The order of equivalent elements in each of the original two ranges is preserved. For equivalent elements in the original two ranges, the elements from the first range precede the elements from the second range.

Note

Complexity: Performs O(std::distance(first, last)) applications of the comparison comp and each projection.

Template Parameters
  • RandIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of inplace_merge requires Comp to meet the requirements of CopyConstructible. This defaults to std::less<>

Parameters
  • first – Refers to the beginning of the first sorted range the algorithm will be applied to.

  • middle – Refers to the end of the first sorted range and the beginning of the second sorted range the algorithm will be applied to.

  • last – Refers to the end of the second sorted range the algorithm will be applied to.

  • comp – A callable object which returns true if the first argument is less than the second, and false otherwise. The signature of this comparison should be equivalent to:

    bool comp(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types RandIter can be dereferenced and then implicitly converted to both Type1 and Type2

Returns

The inplace_merge algorithm returns void.
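A minimal sketch of the in-place variant follows. It uses std::inplace_merge as a stand-in for the HPX overload (an assumption), and merge_halves is a hypothetical helper name. The parameter mid corresponds to the middle iterator described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Merge the sorted halves [0, mid) and [mid, size) of v in place.
// Assumption: std::inplace_merge stands in for hpx::inplace_merge here.
void merge_halves(std::vector<int>& v, std::size_t mid)
{
    assert(mid <= v.size());
    auto middle = v.begin() + static_cast<std::ptrdiff_t>(mid);

    // Both consecutive subranges must already be sorted.
    assert(std::is_sorted(v.begin(), middle));
    assert(std::is_sorted(middle, v.end()));

    // Afterwards the whole range [begin, end) is sorted.
    std::inplace_merge(v.begin(), middle, v.end());
}
```

The function has no return value; the merged result is observed through the modified range itself.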

hpx::min_element, hpx::max_element, hpx::minmax_element#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter, typename F = hpx::parallel::detail::less>
FwdIter min_element(FwdIter first, FwdIter last, F &&f)#

Finds the smallest element in the range [first, last) using the given comparison function f.

The comparisons in the parallel min_element algorithm execute in sequential order in the calling thread.

Note

Complexity: Exactly max(N-1, 0) comparisons, where N = std::distance(first, last).

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – The binary predicate which returns true if the left argument is less than the right argument. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type1.

Returns

The min_element algorithm returns FwdIter. The min_element algorithm returns the iterator to the smallest element in the range [first, last). If several elements in the range are equivalent to the smallest element, returns the iterator to the first such element. Returns last if the range is empty.
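The tie-breaking rule and the empty-range case can be sketched as follows. The example uses std::min_element as a stand-in for the HPX overload (an assumption), and index_of_shortest is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Return the index of the shortest string, or v.size() if v is empty.
// Assumption: std::min_element stands in for hpx::min_element here.
std::size_t index_of_shortest(std::vector<std::string> const& v)
{
    // The predicate returns true if the left argument is "less" (shorter).
    auto shorter = [](std::string const& a, std::string const& b) {
        return a.size() < b.size();
    };

    // For an empty range the algorithm returns last, which maps to v.size().
    auto it = std::min_element(v.begin(), v.end(), shorter);
    return static_cast<std::size_t>(std::distance(v.begin(), it));
}
```

If several strings share the minimal length, the index of the first one is returned.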

template<typename ExPolicy, typename FwdIter, typename F = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> min_element(ExPolicy &&policy, FwdIter first, FwdIter last, F &&f)#

Finds the smallest element in the range [first, last) using the given comparison function f. Executed according to the policy.

The comparisons in the parallel min_element algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparisons in the parallel min_element algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Exactly max(N-1, 0) comparisons, where N = std::distance(first, last).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of min_element requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – The binary predicate which returns true if the left argument is less than the right argument. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type1.

Returns

The min_element algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The min_element algorithm returns the iterator to the smallest element in the range [first, last). If several elements in the range are equivalent to the smallest element, returns the iterator to the first such element. Returns last if the range is empty.
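The stated complexity can be checked with a counting predicate. The sketch below uses the sequential std::min_element as a stand-in (an assumption), and count_min_element_comparisons is a hypothetical helper name. The shared counter is only safe because the call is sequential; with a parallel policy a predicate like this would need an atomic counter.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Count how often the comparison is invoked for a given input.
// Assumption: std::min_element stands in for hpx::min_element here.
std::size_t count_min_element_comparisons(std::vector<int> const& v)
{
    std::size_t calls = 0;

    // Not thread-safe: the counter is a plain variable captured by reference.
    auto counting_less = [&calls](int a, int b) {
        ++calls;
        return a < b;
    };

    (void) std::min_element(v.begin(), v.end(), counting_less);
    return calls;
}
```

For N elements the counter ends at max(N-1, 0), which matches the complexity note above.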

template<typename FwdIter, typename F = hpx::parallel::detail::less>
FwdIter max_element(FwdIter first, FwdIter last, F &&f)#

Finds the largest element in the range [first, last) using the given comparison function f.

The comparisons in the parallel max_element algorithm execute in sequential order in the calling thread.

Note

Complexity: Exactly max(N-1, 0) comparisons, where N = std::distance(first, last).

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – The binary predicate which returns true if the left argument is less than the right argument. This argument is optional and defaults to std::less. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type1.

Returns

The max_element algorithm returns FwdIter. The max_element algorithm returns the iterator to the largest element in the range [first, last). If several elements in the range are equivalent to the largest element, returns the iterator to the first such element. Returns last if the range is empty.
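Note that the predicate for max_element is still a "less than" comparison, not a "greater than" one. A minimal sketch, using std::max_element as a stand-in for the HPX overload (an assumption), with a hypothetical Player type and top_scorer helper:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type used only for this illustration.
struct Player
{
    std::string name;
    int score;
};

// Return the name of the player with the highest score, or "" if none.
// Assumption: std::max_element stands in for hpx::max_element here.
std::string top_scorer(std::vector<Player> const& players)
{
    // A "less than" predicate on the score field.
    auto lower_score = [](Player const& a, Player const& b) {
        return a.score < b.score;
    };

    auto it = std::max_element(players.begin(), players.end(), lower_score);

    // For an empty range the algorithm returns last; do not dereference it.
    return it == players.end() ? std::string{} : it->name;
}
```

If two players share the highest score, the one that appears first in the range is reported.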

template<typename ExPolicy, typename FwdIter, typename F = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result<ExPolicy, FwdIter>::type max_element(ExPolicy &&policy, FwdIter first, FwdIter last, F &&f)#

Finds the largest element in the range [first, last) using the given comparison function f. Executed according to the policy.

The comparisons in the parallel max_element algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparisons in the parallel max_element algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Exactly max(N-1, 0) comparisons, where N = std::distance(first, last).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of max_element requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – The binary predicate which returns true if the left argument is less than the right argument. This argument is optional and defaults to std::less. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type1.

Returns

The max_element algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The max_element algorithm returns the iterator to the largest element in the range [first, last). If several elements in the range are equivalent to the largest element, returns the iterator to the first such element. Returns last if the range is empty.

template<typename FwdIter, typename F = hpx::parallel::detail::less>
minmax_element_result<FwdIter> minmax_element(FwdIter first, FwdIter last, F &&f)#

Finds the smallest and the largest element in the range [first, last) using the given comparison function f.

The comparisons in the parallel minmax_element algorithm execute in sequential order in the calling thread.

Note

Complexity: At most max(floor(3/2*(N-1)), 0) applications of the predicate, where N = std::distance(first, last).

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – The binary predicate which returns true if the left argument is less than the right argument. This argument is optional and defaults to std::less. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type1.

Returns

The minmax_element algorithm returns a minmax_element_result<FwdIter>. The minmax_element algorithm returns a pair consisting of an iterator to the smallest element as the min element and an iterator to the largest element as the max element. Returns minmax_element_result<FwdIter>{first,first} if the range is empty. If several elements are equivalent to the smallest element, the iterator to the first such element is returned. If several elements are equivalent to the largest element, the iterator to the last such element is returned.
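The asymmetric tie-breaking (first smallest, last largest) is easy to overlook. The sketch below uses std::minmax_element as a stand-in (an assumption). It returns a std::pair whose first and second members play the role of the min and max members described above. The extreme_positions helper is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Return {index of the smallest element, index of the largest element}.
// Assumption: std::minmax_element stands in for hpx::minmax_element here.
std::pair<std::size_t, std::size_t> extreme_positions(
    std::vector<int> const& v)
{
    auto r = std::minmax_element(v.begin(), v.end());

    // For an empty range both iterators equal first, so both indices are 0.
    return {static_cast<std::size_t>(std::distance(v.begin(), r.first)),
        static_cast<std::size_t>(std::distance(v.begin(), r.second))};
}
```

For the input {5, 1, 9, 1, 9, 3} the smallest value 1 occurs at indices 1 and 3 and the largest value 9 at indices 2 and 4; the reported positions are 1 (first smallest) and 4 (last largest).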

template<typename ExPolicy, typename FwdIter, typename F = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, minmax_element_result<FwdIter>> minmax_element(ExPolicy &&policy, FwdIter first, FwdIter last, F &&f)#

Finds the smallest and the largest element in the range [first, last) using the given comparison function f. Executed according to the policy.

The comparisons in the parallel minmax_element algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparisons in the parallel minmax_element algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most max(floor(3/2*(N-1)), 0) applications of the predicate, where N = std::distance(first, last).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of minmax_element requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – The binary predicate which returns true if the left argument is less than the right argument. This argument is optional and defaults to std::less. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type1.

Returns

The minmax_element algorithm returns a hpx::future<minmax_element_result<FwdIter>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns minmax_element_result<FwdIter> otherwise. The minmax_element algorithm returns a pair consisting of an iterator to the smallest element as the min element and an iterator to the largest element as the max element. Returns minmax_element_result<FwdIter>{first,first} if the range is empty. If several elements are equivalent to the smallest element, the iterator to the first such element is returned. If several elements are equivalent to the largest element, the iterator to the last such element is returned.

hpx::mismatch#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Pred>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, std::pair<FwdIter1, FwdIter2>> mismatch(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, Pred &&op)#

Returns the first mismatching pair of elements from two ranges: one defined by [first1, last1) and another defined by [first2,last2). If last2 is not provided, it denotes first2 + (last1 - first1). Executed according to the policy.

The comparison operations in the parallel mismatch algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel mismatch algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most min(last1 - first1, last2 - first2) applications of the predicate op or operator==. If FwdIter1 and FwdIter2 meet the requirements of RandomAccessIterator and (last1 - first1) != (last2 - first2) then no applications of the predicate op or operator== are made.

Note

The elements *i and *(first2 + (i - first1)), for an iterator i in the range [first1,last1), are considered a mismatch if the predicate op returns false for them. This overload of mismatch uses op, rather than operator==, to compare two elements.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of mismatch requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • op – The binary predicate which returns true if the elements should be treated as equal, that is, as not mismatching. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively

Returns

The mismatch algorithm returns a hpx::future<std::pair<FwdIter1,FwdIter2>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns std::pair<FwdIter1,FwdIter2> otherwise. If no mismatches are found when the comparison reaches last1 or last2, whichever happens first, the pair holds the end iterator and the corresponding iterator from the other range.
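A typical use of the predicate overload is computing a common prefix under a custom notion of equality. The sketch below uses std::mismatch as a stand-in for the HPX overload and assumes that op follows the std::mismatch convention of returning true when two elements match, which is consistent with the default std::equal_to<>. The helper common_prefix_length_nocase is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <iterator>
#include <string>

// Length of the common prefix of a and b, ignoring letter case.
// Assumption: std::mismatch stands in for hpx::mismatch here.
std::size_t common_prefix_length_nocase(
    std::string const& a, std::string const& b)
{
    // Returns true when the two characters match (not when they differ).
    auto same_letter = [](char x, char y) {
        return std::tolower(static_cast<unsigned char>(x)) ==
            std::tolower(static_cast<unsigned char>(y));
    };

    // Passing both end iterators bounds the scan by the shorter range.
    auto p =
        std::mismatch(a.begin(), a.end(), b.begin(), b.end(), same_letter);

    // p.first points at the first mismatching element of a, or at a.end().
    return static_cast<std::size_t>(std::distance(a.begin(), p.first));
}
```

The returned pair holds one iterator into each range, so the same call also yields the mismatch position in b through p.second.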

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, std::pair<FwdIter1, FwdIter2>> mismatch(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2)#

Returns the first mismatching pair of elements from two ranges: one defined by [first1, last1) and another defined by [first2,last2). If last2 is not provided, it denotes first2 + (last1 - first1). Executed according to the policy.

The comparison operations in the parallel mismatch algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel mismatch algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most min(last1 - first1, last2 - first2) applications of operator==. If FwdIter1 and FwdIter2 meet the requirements of RandomAccessIterator and (last1 - first1) != (last2 - first2) then no applications of operator== are made.

Note

The elements *i and *(first2 + (i - first1)), for an iterator i in the range [first1,last1), are considered a mismatch if they do not compare equal. This overload of mismatch uses operator== to compare two elements.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

Returns

The mismatch algorithm returns a hpx::future<std::pair<FwdIter1,FwdIter2>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns std::pair<FwdIter1,FwdIter2> otherwise. If no mismatches are found when the comparison reaches last1 or last2, whichever happens first, the pair holds the end iterator and the corresponding iterator from the other range.
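The description of the returned pair when no mismatch is found can be turned into a prefix test. The sketch below uses std::mismatch as a stand-in (an assumption); how hpx::mismatch treats ranges of different length should be verified against the HPX version in use. The is_prefix helper is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// True if every element of prefix matches the start of text.
// Assumption: std::mismatch stands in for hpx::mismatch here.
bool is_prefix(std::vector<int> const& prefix, std::vector<int> const& text)
{
    auto p = std::mismatch(
        prefix.begin(), prefix.end(), text.begin(), text.end());

    // If the scan reached the end of prefix without a mismatch,
    // the first member of the pair is the end iterator of prefix.
    return p.first == prefix.end();
}
```

When text is shorter than prefix, the scan stops at the end of text, so the first member of the pair is not prefix.end() and the function reports false.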

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Pred>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, std::pair<FwdIter1, FwdIter2>> mismatch(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, Pred &&op)#

Returns the first mismatching pair of elements from two ranges: one defined by [first1, last1) and another defined by [first2,last2). If last2 is not provided, it denotes first2 + (last1 - first1). Executed according to the policy.

The comparison operations in the parallel mismatch algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel mismatch algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last1 - first1 applications of the predicate op or operator==. If FwdIter1 and FwdIter2 meet the requirements of RandomAccessIterator and (last1 - first1) != (last2 - first2) then no applications of the predicate op or operator== are made.

Note

The elements *i and *(first2 + (i - first1)), for an iterator i in the range [first1,last1), are considered a mismatch if the predicate op returns false for them. This overload of mismatch uses op, rather than operator==, to compare two elements.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of mismatch requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • op – The binary predicate which returns true if the elements should be treated as equal, that is, as not mismatching. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively

Returns

The mismatch algorithm returns a hpx::future<std::pair<FwdIter1,FwdIter2>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns std::pair<FwdIter1,FwdIter2> otherwise. If no mismatches are found when the comparison reaches last1 or last2, whichever happens first, the pair holds the end iterator and the corresponding iterator from the other range.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, std::pair<FwdIter1, FwdIter2>> mismatch(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2)#

Returns the first mismatching pair of elements from two ranges: one defined by [first1, last1) and another defined by [first2,last2). If last2 is not provided, it denotes first2 + (last1 - first1). Executed according to the policy.

The comparison operations in the parallel mismatch algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel mismatch algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last1 - first1 applications of operator==. If FwdIter1 and FwdIter2 meet the requirements of RandomAccessIterator and (last1 - first1) != (last2 - first2) then no applications of operator== are made.

Note

Elements are compared pairwise: for each iterator i in the range [first1, last1), *i is compared with *(first2 + (i - first1)), and the first pair that does not compare equal is the mismatch. This overload of mismatch uses operator== to determine whether two elements are equal.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

Returns

The mismatch algorithm returns a hpx::future<std::pair<FwdIter1,FwdIter2>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns std::pair<FwdIter1,FwdIter2> otherwise. If no mismatches are found when the comparison reaches last1 or last2, whichever happens first, the pair holds the end iterator and the corresponding iterator from the other range.

template<typename FwdIter1, typename FwdIter2, typename Pred>
std::pair<FwdIter1, FwdIter2> mismatch(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, Pred &&op)#

Returns the first mismatching pair of elements from two ranges: one defined by [first1, last1) and another defined by [first2,last2). If last2 is not provided, it denotes first2 + (last1 - first1).

Note

Complexity: At most min(last1 - first1, last2 - first2) applications of the predicate op or operator==. If FwdIter1 and FwdIter2 meet the requirements of RandomAccessIterator and (last1 - first1) != (last2 - first2) then no applications of the predicate op or operator== are made.

Note

Elements are compared pairwise, *i against *(first2 + (i - first1)) for each iterator i in [first1, last1), until the end of the shorter range is reached; the first pair that does not compare equal is the mismatch. This overload of mismatch uses the predicate op to determine whether two elements are equal.

Template Parameters
  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of mismatch requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>.

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The mismatch algorithm returns a std::pair<FwdIter1,FwdIter2>. If no mismatches are found when the comparison reaches last1 or last2, whichever happens first, the pair holds the end iterator and the corresponding iterator from the other range.
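As an illustration of the predicate form, the following minimal sketch compares two strings case-insensitively. It uses std::mismatch as a stand-in: the overload documented above takes the same arguments and returns the same pair type, so substituting hpx::mismatch (with hpx/algorithm.hpp included) is assumed, not verified here, to behave the same way. The helper name first_difference_ignoring_case is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the index of the first position at which `a`
// and `b` differ when compared case-insensitively, or the length of the
// shorter string if one is a prefix of the other.
std::size_t first_difference_ignoring_case(
    std::string const& a, std::string const& b)
{
    // The predicate returns true when two characters count as equal.
    auto same_letter = [](unsigned char x, unsigned char y) {
        return std::tolower(x) == std::tolower(y);
    };

    auto const result =
        std::mismatch(a.begin(), a.end(), b.begin(), b.end(), same_letter);

    // Both iterators of the returned pair have advanced by the same distance.
    assert(result.first - a.begin() == result.second - b.begin());
    return static_cast<std::size_t>(result.first - a.begin());
}
```

Because both end iterators are passed, the comparison stops at the end of the shorter string instead of reading past it.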

template<typename FwdIter1, typename FwdIter2>
std::pair<FwdIter1, FwdIter2> mismatch(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2)#

Returns the first mismatching pair of elements from two ranges: one defined by [first1, last1) and another defined by [first2,last2). If last2 is not provided, it denotes first2 + (last1 - first1).

Note

Complexity: At most min(last1 - first1, last2 - first2) applications of operator==. If FwdIter1 and FwdIter2 meet the requirements of RandomAccessIterator and (last1 - first1) != (last2 - first2) then no applications of operator== are made.

Note

Elements are compared pairwise, *i against *(first2 + (i - first1)) for each iterator i in [first1, last1), until the end of the shorter range is reached; the first pair that does not compare equal is the mismatch. This overload of mismatch uses operator== to determine whether two elements are equal.

Template Parameters
  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

Returns

The mismatch algorithm returns a std::pair<FwdIter1,FwdIter2>. If no mismatches are found when the comparison reaches last1 or last2, whichever happens first, the pair holds the end iterator and the corresponding iterator from the other range.
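The "no mismatch" case described above is what makes a prefix test possible. The sketch below is an assumption-laden illustration: it calls std::mismatch, whose four-iterator overload has the same shape as the one documented here, and the function name is_prefix_of is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: returns true if `prefix` is a prefix of `full`.
bool is_prefix_of(std::vector<int> const& prefix, std::vector<int> const& full)
{
    auto const result = std::mismatch(
        prefix.begin(), prefix.end(), full.begin(), full.end());

    // Either one of the ranges was exhausted, or the two iterators point at
    // the first pair of elements that differ.
    assert(result.first == prefix.end() || result.second == full.end() ||
        *result.first != *result.second);

    // Reaching the end of `prefix` means every one of its elements matched.
    return result.first == prefix.end();
}
```

If `prefix` is longer than `full`, the comparison stops at the end of `full`, the first iterator of the pair is not `prefix.end()`, and the function returns false.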

template<typename FwdIter1, typename FwdIter2, typename Pred>
std::pair<FwdIter1, FwdIter2> mismatch(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, Pred &&op)#

Returns the first mismatching pair of elements from two ranges: one defined by [first1, last1) and another defined by [first2,last2). If last2 is not provided, it denotes first2 + (last1 - first1).

Note

Complexity: At most last1 - first1 applications of the predicate op or operator==. If FwdIter1 and FwdIter2 meet the requirements of RandomAccessIterator and (last1 - first1) != (last2 - first2) then no applications of the predicate op or operator== are made.

Note

Elements are compared pairwise: for each iterator i in the range [first1, last1), *i is compared with *(first2 + (i - first1)), and the first pair that does not compare equal is the mismatch. This overload of mismatch uses the predicate op to determine whether two elements are equal.

Template Parameters
  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of mismatch requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>.

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The mismatch algorithm returns a std::pair<FwdIter1,FwdIter2>. If no mismatches are found when the comparison reaches last1 or last2, whichever happens first, the pair holds the end iterator and the corresponding iterator from the other range.

template<typename FwdIter1, typename FwdIter2>
std::pair<FwdIter1, FwdIter2> mismatch(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2)#

Returns the first mismatching pair of elements from two ranges: one defined by [first1, last1) and another defined by [first2,last2). If last2 is not provided, it denotes first2 + (last1 - first1).

Note

Complexity: At most last1 - first1 applications of operator==. If FwdIter1 and FwdIter2 meet the requirements of RandomAccessIterator and (last1 - first1) != (last2 - first2) then no applications of operator== are made.

Note

Elements are compared pairwise: for each iterator i in the range [first1, last1), *i is compared with *(first2 + (i - first1)), and the first pair that does not compare equal is the mismatch. This overload of mismatch uses operator== to determine whether two elements are equal.

Template Parameters
  • FwdIter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

Returns

The mismatch algorithm returns a std::pair<FwdIter1,FwdIter2>. If no mismatches are found when the comparison reaches last1 or last2, whichever happens first, the pair holds the end iterator and the corresponding iterator from the other range.

hpx::move#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> move(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest)#

Moves the elements in the range [first, last) to another range beginning at dest. After this operation the elements in the moved-from range will still contain valid values of the appropriate type, but not necessarily the same values as before the move. Executed according to the policy.

The move assignments in the parallel move algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The move assignments in the parallel move algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first move assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the move assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The move algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The move algorithm returns the output iterator to the element in the destination range, one past the last element moved.

template<typename FwdIter1, typename FwdIter2>
FwdIter2 move(FwdIter1 first, FwdIter1 last, FwdIter2 dest)#

Moves the elements in the range [first, last) to another range beginning at dest. After this operation the elements in the moved-from range will still contain valid values of the appropriate type, but not necessarily the same values as before the move.

Note

Complexity: Performs exactly last - first move assignments.

Template Parameters
  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The move algorithm returns a FwdIter2: the output iterator to the element in the destination range, one past the last element moved.
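The return value and the state of the moved-from range can be seen in a small sketch. It uses the three-argument std::move algorithm from <algorithm> as a stand-in for the overload documented above, on the assumption that hpx::move with the same arguments behaves identically; the helper name take_all is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: moves every string out of `source` into a new vector.
std::vector<std::string> take_all(std::vector<std::string>& source)
{
    // The destination range must already hold enough elements to assign to.
    std::vector<std::string> dest(source.size());

    // Move-assigns each element and returns the iterator one past the last
    // element written in the destination range.
    auto const end_of_written =
        std::move(source.begin(), source.end(), dest.begin());
    assert(end_of_written == dest.end());

    // The moved-from strings are still valid objects, but their values are
    // unspecified, so this sketch discards them.
    source.clear();
    return dest;
}
```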

hpx::nth_element#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename RandomIt, typename Pred = hpx::parallel::detail::less>
void nth_element(RandomIt first, RandomIt nth, RandomIt last, Pred &&pred = Pred())#

nth_element is a partial sorting algorithm that rearranges elements in [first, last) such that the element pointed at by nth is changed to whatever element would occur in that position if [first, last) were sorted and all of the elements before this new nth element are less than or equal to the elements after the new nth element.

The comparison operations in the parallel nth_element algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: Linear in std::distance(first, last) on average. O(N) applications of the predicate, and O(N log N) swaps, where N = last - first.

Template Parameters
  • RandomIt – The type of the source begin, nth, and end iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Pred – Comparison function object which returns true if the first argument is less than the second. This defaults to std::less<>.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • nth – Refers to the iterator defining the sort partition point.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Specifies the comparison function object which returns true if the first argument is less than (i.e. is ordered before) the second. The signature of this comparison function should be equivalent to:

    bool cmp(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that an object of type RandomIt can be dereferenced and then implicitly converted to both of them. This defaults to std::less<>.

Returns

The nth_element algorithm returns nothing.
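A typical use of this rearrangement is selecting an order statistic without fully sorting. The sketch below relies on std::nth_element, whose parameters match the overload documented above; that hpx::nth_element can be dropped in unchanged is an assumption. The helper name nth_smallest is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the element that would sit at index `n` if
// `values` were sorted. The vector is taken by value because nth_element
// reorders its input.
int nth_smallest(std::vector<int> values, std::size_t n)
{
    assert(n < values.size());
    auto const nth = values.begin() + static_cast<std::ptrdiff_t>(n);

    std::nth_element(values.begin(), nth, values.end());

    // Everything before nth is now less than or equal to *nth.
    assert(std::all_of(
        values.begin(), nth, [&](int v) { return v <= *nth; }));
    return *nth;
}
```

Passing `values.size() / 2` as `n` yields the upper median of the sequence.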

template<typename ExPolicy, typename RandomIt, typename Pred = hpx::parallel::detail::less>
void nth_element(ExPolicy &&policy, RandomIt first, RandomIt nth, RandomIt last, Pred &&pred = Pred())#

nth_element is a partial sorting algorithm that rearranges elements in [first, last) such that the element pointed at by nth is changed to whatever element would occur in that position if [first, last) were sorted and all of the elements before this new nth element are less than or equal to the elements after the new nth element. Executed according to the policy.

The comparison operations in the parallel nth_element invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel nth_element algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Linear in std::distance(first, last) on average. O(N) applications of the predicate, and O(N log N) swaps, where N = last - first.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • RandomIt – The type of the source begin, nth, and end iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Pred – Comparison function object which returns true if the first argument is less than the second. This defaults to std::less<>.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • nth – Refers to the iterator defining the sort partition point.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Specifies the comparison function object which returns true if the first argument is less than (i.e. is ordered before) the second. The signature of this comparison function should be equivalent to:

    bool cmp(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that an object of type RandomIt can be dereferenced and then implicitly converted to both of them. This defaults to std::less<>.

Returns

The nth_element algorithm returns nothing.

hpx::partial_sort#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename RandIter, typename Comp = hpx::parallel::detail::less>
RandIter partial_sort(RandIter first, RandIter middle, RandIter last, Comp &&comp = Comp())#

Places the first middle - first elements from the range [first, last) as sorted with respect to comp into the range [first, middle). The rest of the elements in the range [middle, last) are placed in an unspecified order.

Note

Complexity: Approximately (last - first) * log(middle - first) comparisons.

Template Parameters
  • RandIter – The type of the source begin, middle, and end iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced). Comp defaults to detail::less.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • middle – Refers to the middle of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • comp – comp is a callable object. The return value of the INVOKE operation applied to an object of type Comp, when contextually converted to bool, yields true if the first argument of the call is less than the second, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator. It defaults to detail::less.

Returns

The partial_sort algorithm returns a RandIter that refers to last.
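The split between the sorted front part and the unordered remainder can be sketched as follows. The example calls std::partial_sort, which takes the same iterator arguments but returns void, whereas the overload documented above returns an iterator referring to last. Using hpx::partial_sort in its place is an assumption, and the helper name smallest_k is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the `k` smallest values in ascending order.
std::vector<int> smallest_k(std::vector<int> values, std::size_t k)
{
    assert(k <= values.size());
    auto const middle = values.begin() + static_cast<std::ptrdiff_t>(k);

    // Only [begin, middle) ends up sorted; the order of [middle, end) is
    // unspecified.
    std::partial_sort(values.begin(), middle, values.end());

    // Drop the unordered remainder.
    values.erase(middle, values.end());
    return values;
}
```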

template<typename ExPolicy, typename RandIter, typename Comp = hpx::parallel::detail::less>
parallel::util::detail::algorithm_result_t<ExPolicy, RandIter> partial_sort(ExPolicy &&policy, RandIter first, RandIter middle, RandIter last, Comp &&comp = Comp())#

Places the first middle - first elements from the range [first, last) as sorted with respect to comp into the range [first, middle). The rest of the elements in the range [middle, last) are placed in an unspecified order.

Note

Complexity: Approximately (last - first) * log(middle - first) comparisons.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • RandIter – The type of the source begin, middle, and end iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced). Comp defaults to detail::less.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • middle – Refers to the middle of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • comp – comp is a callable object. The return value of the INVOKE operation applied to an object of type Comp, when contextually converted to bool, yields true if the first argument of the call is less than the second, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator. It defaults to detail::less.

Returns

The partial_sort algorithm returns a hpx::future<RandIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns RandIter otherwise. The iterator returned refers to last.

hpx::partial_sort_copy#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter, typename RandIter, typename Comp = hpx::parallel::detail::less>
RandIter partial_sort_copy(InIter first, InIter last, RandIter d_first, RandIter d_last, Comp &&comp = Comp())#

Sorts some of the elements in the range [first, last) in ascending order, storing the result in the range [d_first, d_last). At most d_last - d_first elements are placed, in sorted order, into the range [d_first, d_first + n), where n is the number of elements to sort (n = min(last - first, d_last - d_first)).

The applications of function objects in the parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The applications of function objects in the parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: O(N log(min(D, N))) comparisons, where N = std::distance(first, last) and D = std::distance(d_first, d_last).

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • RandIter – The type of the destination iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced). Comp defaults to detail::less.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • d_first – Refers to the beginning of the destination range.

  • d_last – Refers to the end of the destination range.

  • comp – comp is a callable object. The return value of the INVOKE operation applied to an object of type Comp, when contextually converted to bool, yields true if the first argument of the call is less than the second, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator. This defaults to detail::less.

Returns

The partial_sort_copy algorithm returns a RandIter. The algorithm returns an iterator to the element defining the upper boundary of the sorted range, i.e. d_first + min(last - first, d_last - d_first).
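The returned upper boundary matters when the source holds fewer elements than the destination. The following sketch uses std::partial_sort_copy, which has the same parameters and return value as the overload documented above; substituting hpx::partial_sort_copy is an assumption. The helper name largest_k is hypothetical, and std::greater<> is passed as comp so that "sorted" means descending here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: copies the (up to) `k` largest values of `input`
// into a new vector in descending order, leaving `input` unchanged.
std::vector<int> largest_k(std::vector<int> const& input, std::size_t k)
{
    std::vector<int> out(k);

    auto const sorted_end = std::partial_sort_copy(
        input.begin(), input.end(), out.begin(), out.end(), std::greater<>{});

    // The boundary is d_first + min(last - first, d_last - d_first).
    assert(static_cast<std::size_t>(sorted_end - out.begin()) ==
        std::min(k, input.size()));

    // If the input was shorter than k, discard the unfilled tail.
    out.erase(sorted_end, out.end());
    return out;
}
```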

template<typename ExPolicy, typename FwdIter, typename RandIter, typename Comp = hpx::parallel::detail::less>
parallel::util::detail::algorithm_result_t<ExPolicy, RandIter> partial_sort_copy(ExPolicy &&policy, FwdIter first, FwdIter last, RandIter d_first, RandIter d_last, Comp &&comp = Comp())#

Sorts some of the elements in the range [first, last) in ascending order, storing the result in the range [d_first, d_last). At most d_last - d_first elements are placed, in sorted order, into the range [d_first, d_first + n), where n is the number of elements to sort (n = min(last - first, d_last - d_first)). Executed according to the policy.

The applications of function objects in the parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The applications of function objects in the parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: O(N log(min(D, N))) comparisons, where N = std::distance(first, last) and D = std::distance(d_first, d_last).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • RandIter – The type of the destination iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced). Comp defaults to detail::less.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • d_first – Refers to the beginning of the destination range.

  • d_last – Refers to the end of the destination range.

  • comp – comp is a callable object. The return value of the INVOKE operation applied to an object of type Comp, when contextually converted to bool, yields true if the first argument of the call is less than the second, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator. This defaults to detail::less.

Returns

The partial_sort_copy algorithm returns a hpx::future<RandIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns RandIter otherwise. The algorithm returns an iterator to the element defining the upper boundary of the sorted range, i.e. d_first + min(last - first, d_last - d_first).

hpx::partition, hpx::stable_partition, hpx::partition_copy#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter, typename Pred, typename Proj = hpx::identity>
FwdIter partition(FwdIter first, FwdIter last, Pred &&pred, Proj &&proj = Proj())#

Reorders the elements in the range [first, last) in such a way that all elements for which the predicate pred returns true precede the elements for which the predicate pred returns false. Relative order of the elements is not preserved.

The assignments in the parallel partition algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: At most 2 * (last - first) swaps. Exactly last - first applications of the predicate and projection.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of partition requires Pred to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate for partitioning the source iterators. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The partition algorithm returns FwdIter: the iterator to the first element of the second group.
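The behavior described above can be sketched with the standard library. This is a minimal illustration that assumes the sequential hpx::partition overload behaves like std::partition for a plain predicate; the helper name partition_evens_first is hypothetical and not part of HPX.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: moves even numbers to the front of `values` and
// returns the size of the first (predicate-true) group.
// std::partition is used as a stand-in for the sequential hpx::partition.
std::size_t partition_evens_first(std::vector<int>& values)
{
    auto is_even = [](int v) { return v % 2 == 0; };

    // The returned iterator points at the first element of the second group.
    auto second_group = std::partition(values.begin(), values.end(), is_even);

    // Postcondition: every even element precedes every odd element.
    // The relative order inside each group is not guaranteed.
    assert(std::is_partitioned(values.begin(), values.end(), is_even));

    return static_cast<std::size_t>(second_group - values.begin());
}
```

For the input {1, 2, 3, 4, 5, 6} the helper returns 3, but the order of the elements within each group is unspecified.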

template<typename ExPolicy, typename FwdIter, typename Pred, typename Proj = hpx::identity>
parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> partition(ExPolicy &&policy, FwdIter first, FwdIter last, Pred &&pred, Proj &&proj = Proj())#

Reorders the elements in the range [first, last) in such a way that all elements for which the predicate pred returns true precede the elements for which the predicate pred returns false. Relative order of the elements is not preserved.

The assignments in the parallel partition algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel partition algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most 2 * (last - first) swaps. Exactly last - first applications of the predicate and projection.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of partition requires Pred to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate for partitioning the source iterators. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The partition algorithm returns a hpx::future<FwdIter> if the execution policy is of type parallel_task_policy and returns FwdIter otherwise. The partition algorithm returns the iterator to the first element of the second group.
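How pred and proj combine can be sketched as follows. std::partition has no projection parameter, so this illustration composes the two callables by hand as pred(proj(element)); that this matches what HPX does internally is an assumption. The type job and the helper urgent_first are hypothetical names used only for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical element type used only for this illustration.
struct job
{
    std::string name;
    int priority;
};

// Hypothetical helper: moves jobs with priority >= threshold to the front
// and returns the iterator to the first job of the second group.
std::vector<job>::iterator urgent_first(std::vector<job>& jobs, int threshold)
{
    auto proj = &job::priority;    // projection: job -> int
    auto pred = [threshold](int p) { return p >= threshold; };

    // The predicate only ever sees the projected value.
    auto second_group = std::partition(jobs.begin(), jobs.end(),
        [&](job const& j) { return std::invoke(pred, std::invoke(proj, j)); });

    assert(second_group >= jobs.begin() && second_group <= jobs.end());
    return second_group;
}
```

The predicate never sees a job object, only the int produced by the projection, which keeps it reusable across element types.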

template<typename BidirIter, typename F, typename Proj = hpx::identity>
BidirIter stable_partition(BidirIter first, BidirIter last, F &&f, Proj &&proj = Proj())#

Permutes the elements in the range [first, last) such that there exists an iterator i such that for every iterator j in the range [first, i), INVOKE(f, INVOKE(proj, *j)) != false, and for every iterator k in the range [i, last), INVOKE(f, INVOKE(proj, *k)) == false.

The invocations of f in the parallel stable_partition algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: At most (last - first) * log(last - first) swaps, but only linear number of swaps if there is enough extra memory. Exactly last - first applications of the predicate and projection.

Template Parameters
  • BidirIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a bidirectional iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of stable_partition requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Unary predicate which returns true if the element should be ordered before other elements. Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool fun(const Type &a);
    
    The signature does not need to have const&. The type Type must be such that an object of type BidirIter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate f is invoked.

Returns

The stable_partition algorithm returns an iterator i such that for every iterator j in the range [first, i), INVOKE(f, INVOKE(proj, *j)) != false, and for every iterator k in the range [i, last), INVOKE(f, INVOKE(proj, *k)) == false. The relative order of the elements in both groups is preserved.
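The order-preserving guarantee can be illustrated with std::stable_partition. This is a sketch which assumes the sequential hpx::stable_partition overload gives the same result for a plain predicate; negatives_first is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: returns a copy of `values` with all negative numbers
// moved to the front. Unlike partition, the relative order inside both
// groups is the same as in the input.
std::vector<int> negatives_first(std::vector<int> values)
{
    auto is_negative = [](int v) { return v < 0; };
    auto second_group =
        std::stable_partition(values.begin(), values.end(), is_negative);

    // Everything before the returned iterator satisfies the predicate.
    assert(std::all_of(values.begin(), second_group, is_negative));
    return values;
}
```

For the input {3, -1, 4, -5, 9, -2} the result is {-1, -5, -2, 3, 4, 9}: both groups keep their original relative order.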

template<typename ExPolicy, typename BidirIter, typename F, typename Proj = hpx::identity>
parallel::util::detail::algorithm_result_t<ExPolicy, BidirIter> stable_partition(ExPolicy &&policy, BidirIter first, BidirIter last, F &&f, Proj &&proj = Proj())#

Permutes the elements in the range [first, last) such that there exists an iterator i such that for every iterator j in the range [first, i), INVOKE(f, INVOKE(proj, *j)) != false, and for every iterator k in the range [i, last), INVOKE(f, INVOKE(proj, *k)) == false.

The invocations of f in the parallel stable_partition algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The invocations of f in the parallel stable_partition algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most (last - first) * log(last - first) swaps, but only linear number of swaps if there is enough extra memory. Exactly last - first applications of the predicate and projection.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the invocations of f.

  • BidirIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a bidirectional iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of stable_partition requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Unary predicate which returns true if the element should be ordered before other elements. Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool fun(const Type &a);
    
    The signature does not need to have const&. The type Type must be such that an object of type BidirIter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate f is invoked.

Returns

The stable_partition algorithm returns an iterator i such that for every iterator j in the range [first, i), INVOKE(f, INVOKE(proj, *j)) != false, and for every iterator k in the range [i, last), INVOKE(f, INVOKE(proj, *k)) == false. The relative order of the elements in both groups is preserved. If the execution policy is of type parallel_task_policy, the algorithm returns a future<> referring to this iterator.

template<typename FwdIter1, typename FwdIter2, typename FwdIter3, typename Pred, typename Proj = hpx::identity>
std::pair<FwdIter2, FwdIter3> partition_copy(FwdIter1 first, FwdIter1 last, FwdIter2 dest_true, FwdIter3 dest_false, Pred &&pred, Proj &&proj = Proj())#

Copies the elements in the range defined by [first, last) to two different ranges, depending on the value returned by the predicate pred. The elements that satisfy the predicate pred are copied to the range beginning at dest_true. The rest of the elements are copied to the range beginning at dest_false. The order of the elements is preserved.

The assignments in the parallel partition_copy algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: Performs not more than last - first assignments, exactly last - first applications of the predicate pred.

Template Parameters
  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range for the elements that satisfy the predicate pred (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter3 – The type of the iterator representing the destination range for the elements that don’t satisfy the predicate pred (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of partition_copy requires Pred to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest_true – Refers to the beginning of the destination range for the elements that satisfy the predicate pred.

  • dest_false – Refers to the beginning of the destination range for the elements that don’t satisfy the predicate pred.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate for partitioning the source iterators. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter1 can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The partition_copy algorithm returns std::pair<FwdIter2, FwdIter3>: the pair of the destination iterator to the end of the dest_true range and the destination iterator to the end of the dest_false range.
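The returned pair of end iterators is typically used to trim pre-sized destination buffers. The sketch below uses std::partition_copy and assumes the sequential hpx::partition_copy overload returns an equivalent pair; split_by_sign is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: copies negative values into the first result vector
// and all other values into the second one, preserving the input order.
std::pair<std::vector<int>, std::vector<int>> split_by_sign(
    std::vector<int> const& in)
{
    // Each destination must be able to hold up to in.size() elements.
    std::vector<int> negative(in.size());
    std::vector<int> non_negative(in.size());

    auto ends = std::partition_copy(in.begin(), in.end(), negative.begin(),
        non_negative.begin(), [](int v) { return v < 0; });

    // ends.first / ends.second mark the end of what was actually written.
    negative.erase(ends.first, negative.end());
    non_negative.erase(ends.second, non_negative.end());

    assert(negative.size() + non_negative.size() == in.size());
    return {std::move(negative), std::move(non_negative)};
}
```

The source range is left unchanged; only the two destination ranges are written.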

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename FwdIter3, typename Pred, typename Proj = hpx::identity>
parallel::util::detail::algorithm_result_t<ExPolicy, std::pair<FwdIter2, FwdIter3>> partition_copy(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest_true, FwdIter3 dest_false, Pred &&pred, Proj &&proj = Proj())#

Copies the elements in the range defined by [first, last) to two different ranges, depending on the value returned by the predicate pred. The elements that satisfy the predicate pred are copied to the range beginning at dest_true. The rest of the elements are copied to the range beginning at dest_false. The order of the elements is preserved.

The assignments in the parallel partition_copy algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel partition_copy algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs not more than last - first assignments, exactly last - first applications of the predicate pred.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range for the elements that satisfy the predicate pred (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter3 – The type of the iterator representing the destination range for the elements that don’t satisfy the predicate pred (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of partition_copy requires Pred to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest_true – Refers to the beginning of the destination range for the elements that satisfy the predicate pred.

  • dest_false – Refers to the beginning of the destination range for the elements that don’t satisfy the predicate pred.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate for partitioning the source iterators. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter1 can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The partition_copy algorithm returns a hpx::future<std::pair<FwdIter2, FwdIter3>> if the execution policy is of type parallel_task_policy and returns std::pair<FwdIter2, FwdIter3> otherwise. The pair consists of the destination iterator to the end of the dest_true range and the destination iterator to the end of the dest_false range.

hpx::reduce#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter, typename F, typename T = typename std::iterator_traits<FwdIter>::value_type>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, T> reduce(ExPolicy &&policy, FwdIter first, FwdIter last, T init, F &&f)#

Returns GENERALIZED_SUM(f, init, *first, …, *(first + (last - first) - 1)). Executed according to the policy.

The reduce operations in the parallel reduce algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel reduce algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between reduce and accumulate is that the behavior of reduce may be non-deterministic for a non-associative or non-commutative binary operation.

Note

Complexity: O(last - first) applications of the predicate f.

Note

GENERALIZED_SUM(op, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • op(GENERALIZED_SUM(op, b1, …, bK), GENERALIZED_SUM(op, bM, …, bN)), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source begin and end iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of reduce requires F to meet the requirements of CopyConstructible.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • init – The initial value for the generalized sum.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a binary predicate. The signature of this predicate should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&. The types Type1 and Ret must be such that an object of type FwdIter can be dereferenced and then implicitly converted to any of those types.

Returns

The reduce algorithm returns a hpx::future<T> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns T otherwise. The reduce algorithm returns the result of the generalized sum over the elements given by the input range [first, last).
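Why a non-associative operation can make reduce non-deterministic follows from the GENERALIZED_SUM definition above: any grouping of the operands is a valid evaluation. The sketch below evaluates two such groupings by hand. It does not call HPX, and the names left_fold, tree_fold and grouping_matters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// One valid grouping: ((a1 op a2) op a3) op ... (this is what accumulate does).
template <typename Op>
int left_fold(std::vector<int> const& a, Op op)
{
    assert(!a.empty());
    int acc = a[0];
    for (std::size_t i = 1; i < a.size(); ++i)
        acc = op(acc, a[i]);
    return acc;
}

// Another valid grouping: split [lo, hi) in the middle and combine the
// results of the two halves, as a parallel implementation might do.
template <typename Op>
int tree_fold(std::vector<int> const& a, std::size_t lo, std::size_t hi, Op op)
{
    assert(lo < hi && hi <= a.size());
    if (hi - lo == 1)
        return a[lo];
    std::size_t mid = lo + (hi - lo) / 2;
    return op(tree_fold(a, lo, mid, op), tree_fold(a, mid, hi, op));
}

// True if the two groupings disagree for subtraction on `a`.
bool grouping_matters(std::vector<int> const& a)
{
    return left_fold(a, std::minus<int>{}) !=
        tree_fold(a, 0, a.size(), std::minus<int>{});
}
```

For {1, 2, 3, 4}, addition gives 10 under both groupings, while subtraction gives ((1 - 2) - 3) - 4 = -8 for the left fold and (1 - 2) - (3 - 4) = 0 for the tree. Only associative and commutative operations yield a result that is independent of the grouping and permutation chosen.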

template<typename ExPolicy, typename FwdIter, typename T = typename std::iterator_traits<FwdIter>::value_type>
util::detail::algorithm_result_t<ExPolicy, T> reduce(ExPolicy &&policy, FwdIter first, FwdIter last, T init)#

Returns GENERALIZED_SUM(+, init, *first, …, *(first + (last - first) - 1)). Executed according to the policy.

The reduce operations in the parallel reduce algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel reduce algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between reduce and accumulate is that the behavior of reduce may be non-deterministic for a non-associative or non-commutative binary operation.

Note

Complexity: O(last - first) applications of the operator+().

Note

GENERALIZED_SUM(+, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • op(GENERALIZED_SUM(+, b1, …, bK), GENERALIZED_SUM(+, bM, …, bN)), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source begin and end iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • init – The initial value for the generalized sum.

Returns

The reduce algorithm returns a hpx::future<T> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns T otherwise. The reduce algorithm returns the result of the generalized sum (applying operator+()) over the elements given by the input range [first, last).

template<typename ExPolicy, typename FwdIter>
hpx::parallel::util::detail::algorithm_result<ExPolicy, typename std::iterator_traits<FwdIter>::value_type>::type reduce(ExPolicy &&policy, FwdIter first, FwdIter last)#

Returns GENERALIZED_SUM(+, T(), *first, …, *(first + (last - first) - 1)). Executed according to the policy.

The reduce operations in the parallel reduce algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel reduce algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between reduce and accumulate is that the behavior of reduce may be non-deterministic for a non-associative or non-commutative binary operation.

Note

Complexity: O(last - first) applications of the operator+().

Note

The type of the initial value (and the result type) T is determined from the value_type of the used FwdIter.

Note

GENERALIZED_SUM(+, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • op(GENERALIZED_SUM(+, b1, …, bK), GENERALIZED_SUM(+, bM, …, bN)), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source begin and end iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The reduce algorithm returns a hpx::future<T> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns T otherwise (where T is the value_type of FwdIter). The reduce algorithm returns the result of the generalized sum (applying operator+()) over the elements given by the input range [first, last).

template<typename FwdIter, typename F, typename T = typename std::iterator_traits<FwdIter>::value_type>
T reduce(FwdIter first, FwdIter last, T init, F &&f)#

Returns GENERALIZED_SUM(f, init, *first, …, *(first + (last - first) - 1)).

The difference between reduce and accumulate is that the behavior of reduce may be non-deterministic for a non-associative or non-commutative binary operation.

Note

Complexity: O(last - first) applications of the predicate f.

Note

GENERALIZED_SUM(op, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • op(GENERALIZED_SUM(op, b1, …, bK), GENERALIZED_SUM(op, bM, …, bN)), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters
  • FwdIter – The type of the source begin and end iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of reduce requires F to meet the requirements of CopyConstructible.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • init – The initial value for the generalized sum.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a binary predicate. The signature of this predicate should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&. The types Type1 and Ret must be such that an object of type FwdIter can be dereferenced and then implicitly converted to any of those types.

Returns

The reduce algorithm returns T. The reduce algorithm returns the result of the generalized sum over the elements given by the input range [first, last).

template<typename FwdIter, typename T = typename std::iterator_traits<FwdIter>::value_type>
T reduce(FwdIter first, FwdIter last, T init)#

Returns GENERALIZED_SUM(+, init, *first, …, *(first + (last - first) - 1)).

The difference between reduce and accumulate is that the behavior of reduce may be non-deterministic for a non-associative or non-commutative binary operation.

Note

Complexity: O(last - first) applications of the operator+().

Note

GENERALIZED_SUM(+, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • op(GENERALIZED_SUM(+, b1, …, bK), GENERALIZED_SUM(+, bM, …, bN)), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters
  • FwdIter – The type of the source begin and end iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • init – The initial value for the generalized sum.

Returns

The reduce algorithm returns T: the result of the generalized sum (applying operator+()) over the elements given by the input range [first, last).

template<typename FwdIter>
std::iterator_traits<FwdIter>::value_type reduce(FwdIter first, FwdIter last)#

Returns GENERALIZED_SUM(+, T(), *first, …, *(first + (last - first) - 1)).

The difference between reduce and accumulate is that the behavior of reduce may be non-deterministic for a non-associative or non-commutative binary operation.

Note

Complexity: O(last - first) applications of the operator+().

Note

The type of the initial value (and the result type) T is determined from the value_type of the used FwdIter.

Note

GENERALIZED_SUM(+, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • op(GENERALIZED_SUM(+, b1, …, bK), GENERALIZED_SUM(+, bM, …, bN)), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters

FwdIter – The type of the source begin and end iterators used (deduced). This iterator type must meet the requirements of an input iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The reduce algorithm returns T (where T is the value_type of FwdIter). The reduce algorithm returns the result of the generalized sum (applying operator+()) over the elements given by the input range [first, last).
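The three overloads without an execution policy differ only in where the initial value and the operation come from. The sketch below uses std::reduce from <numeric> (C++17), assuming the sequential hpx::reduce overloads accept the same argument lists; the helper names are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// reduce(first, last): initial value is value_type(), operation is operator+.
int sum_default_init(std::vector<int> const& v)
{
    return std::reduce(v.begin(), v.end());
}

// reduce(first, last, init): explicit initial value, operation is operator+.
int sum_with_offset(std::vector<int> const& v, int init)
{
    return std::reduce(v.begin(), v.end(), init);
}

// reduce(first, last, init, f): explicit initial value and operation.
// Multiplication is associative and commutative, so the result does not
// depend on how the implementation groups the operands.
int product(std::vector<int> const& v)
{
    assert(!v.empty());
    return std::reduce(v.begin(), v.end(), 1, std::multiplies<>{});
}
```

For {1, 2, 3, 4} these return 10, 110 (with init = 100) and 24. Note that the initial value has to be the identity of the operation (0 for addition, 1 for multiplication) if it is not meant to contribute to the result.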

hpx::reduce_by_key#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental

Top-level namespace.

Functions

template<typename ExPolicy, typename RanIter, typename RanIter2, typename FwdIter1, typename FwdIter2, typename Compare = std::equal_to<typename std::iterator_traits<RanIter>::value_type>, typename Func = std::plus<typename std::iterator_traits<RanIter2>::value_type>>
util::detail::algorithm_result<ExPolicy, util::in_out_result<FwdIter1, FwdIter2>>::type reduce_by_key(ExPolicy &&policy, RanIter key_first, RanIter key_last, RanIter2 values_first, FwdIter1 keys_output, FwdIter2 values_output, Compare &&comp = Compare(), Func &&func = Func())#

Reduce by Key performs an inclusive scan reduction operation on elements supplied in key/value pairs. The algorithm produces a single output value for each set of equal consecutive keys in [key_first, key_last), the value being the GENERALIZED_NONCOMMUTATIVE_SUM(op, init, *first, …, *(first + (i - result))) for the run of consecutive matching keys. The number of keys supplied must match the number of values.

comp has to induce a strict weak ordering on the values.

The applications of function objects in the parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The applications of function objects in the parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: O(last - first) applications of the predicate op.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • RanIter – The type of the key iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • RanIter2 – The type of the value iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • FwdIter1 – The type of the iterator representing the destination key range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination value range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Compare – The type of the optional function/function object to use to compare keys (deduced). Assumed to be std::equal_to otherwise.

  • Func – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of reduce_by_key requires Func to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • key_first – Refers to the beginning of the sequence of key elements the algorithm will be applied to.

  • key_last – Refers to the end of the sequence of key elements the algorithm will be applied to.

  • values_first – Refers to the beginning of the sequence of value elements the algorithm will be applied to.

  • keys_output – Refers to the start output location for the keys produced by the algorithm.

  • values_output – Refers to the start output location for the values produced by the algorithm.

  • comp – comp is a callable object. The return value of the INVOKE operation applied to an object of type Compare, when contextually converted to bool, yields true if the two keys passed to the call are to be treated as equal, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator.

  • func – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a binary predicate. The signature of this predicate should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&. The types Type1 and Ret must be such that an object of type FwdIter can be dereferenced and then implicitly converted to any of those types.

Returns

The reduce_by_key algorithm returns a hpx::future<pair<Iter1,Iter2>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns pair<Iter1,Iter2> otherwise.
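The following is a minimal sequential sketch of what a reduce-by-key operation computes. It is not the HPX implementation. The helper name reduce_by_key_reference is hypothetical, and the sketch assumes the usual convention that each run of consecutive keys for which comp yields true is collapsed into one output key with its values combined by func.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sequential reference (assumption: runs of consecutive
// equal keys are combined). Not part of the HPX API.
template <typename K, typename V, typename Compare = std::equal_to<K>,
    typename Func = std::plus<V>>
std::pair<std::vector<K>, std::vector<V>> reduce_by_key_reference(
    const std::vector<K>& keys, const std::vector<V>& values,
    Compare comp = Compare(), Func func = Func())
{
    assert(keys.size() == values.size());    // one value per key

    std::vector<K> keys_out;
    std::vector<V> values_out;
    for (std::size_t i = 0; i < keys.size(); ++i)
    {
        if (!keys_out.empty() && comp(keys_out.back(), keys[i]))
        {
            // Same run of keys: fold the value into the partial result.
            values_out.back() = func(values_out.back(), values[i]);
        }
        else
        {
            // A new run starts: emit a new key/value pair.
            keys_out.push_back(keys[i]);
            values_out.push_back(values[i]);
        }
    }
    return {std::move(keys_out), std::move(values_out)};
}
```

With keys {1, 1, 2, 2, 2, 1} and values {10, 20, 1, 2, 3, 5}, this sketch produces keys {1, 2, 1} and values {30, 6, 5}. The trailing key 1 forms its own run because it is not adjacent to the first two.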

hpx::reduce_deterministic#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter, typename F, typename T = typename std::iterator_traits<FwdIter>::value_type>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, T> reduce_deterministic(ExPolicy &&policy, FwdIter first, FwdIter last, T init, F &&f)#

Returns GENERALIZED_SUM(f, init, *first, …, *(first + (last - first) - 1)). Executed according to the policy.

The reduce operations in the parallel reduce algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel reduce algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between reduce and accumulate is that the behavior of reduce may be non-deterministic for a non-associative or non-commutative binary operation.

Note

Complexity: O(last - first) applications of the predicate f.

Note

GENERALIZED_SUM(op, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • op(GENERALIZED_SUM(op, b1, …, bK), GENERALIZED_SUM(op, bM, …, bN)), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source begin and end iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of reduce requires F to meet the requirements of CopyConstructible.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • init – The initial value for the generalized sum.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a binary predicate. The signature of this predicate should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&. The types Type1 and Ret must be such that an object of type FwdIter can be dereferenced and then implicitly converted to any of those types.

Returns

The reduce algorithm returns a hpx::future<T> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns T otherwise. The reduce algorithm returns the result of the generalized sum over the elements given by the input range [first, last).
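To illustrate why the grouping permitted by GENERALIZED_SUM matters, the sketch below sums the same four doubles in two ways using only the standard library. The function names are hypothetical, and the stated results assume IEEE 754 double arithmetic evaluated in double precision (the common case on x86-64).

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Strict left-to-right evaluation, as std::accumulate performs it.
double sum_left_to_right(const std::vector<double>& v)
{
    return std::accumulate(v.begin(), v.end(), 0.0);
}

// One regrouping of the same terms: large-magnitude terms are summed
// separately from small ones, then the two partial sums are combined.
double sum_regrouped(const std::vector<double>& v)
{
    double large = 0.0;
    double small = 0.0;
    for (double x : v)
    {
        if (std::fabs(x) >= 1e10)
            large += x;
        else
            small += x;
    }
    return large + small;
}

// Self-check for the sample input discussed below.
void check_grouping_example()
{
    const std::vector<double> v{1e17, 1.0, -1e17, 1.0};
    assert(sum_left_to_right(v) == 1.0);    // the first 1.0 is absorbed by 1e17
    assert(sum_regrouped(v) == 2.0);        // both 1.0 terms survive
}
```

Floating-point addition is not associative, so the two groupings of {1e17, 1.0, -1e17, 1.0} disagree. An algorithm that is free to pick any grouping may therefore return different results from run to run for such inputs.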

template<typename ExPolicy, typename FwdIter, typename T = typename std::iterator_traits<FwdIter>::value_type>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, T> reduce_deterministic(ExPolicy &&policy, FwdIter first, FwdIter last, T init)#

Returns GENERALIZED_SUM(+, init, *first, …, *(first + (last - first) - 1)). Executed according to the policy.

The reduce operations in the parallel reduce algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel reduce algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between reduce and accumulate is that the behavior of reduce may be non-deterministic for a non-associative or non-commutative binary operation.

Note

Complexity: O(last - first) applications of the operator+().

Note

GENERALIZED_SUM(+, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • GENERALIZED_SUM(+, b1, …, bK) + GENERALIZED_SUM(+, bM, …, bN), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source begin and end iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • init – The initial value for the generalized sum.

Returns

The reduce algorithm returns a hpx::future<T> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns T otherwise. The reduce algorithm returns the result of the generalized sum (applying operator+()) over the elements given by the input range [first, last).

template<typename ExPolicy, typename FwdIter>
hpx::parallel::util::detail::algorithm_result<ExPolicy, typename std::iterator_traits<FwdIter>::value_type>::type reduce_deterministic(ExPolicy &&policy, FwdIter first, FwdIter last)#

Returns GENERALIZED_SUM(+, T(), *first, …, *(first + (last - first) - 1)). Executed according to the policy.

The reduce operations in the parallel reduce algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel reduce algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between reduce and accumulate is that the behavior of reduce may be non-deterministic for a non-associative or non-commutative binary operation.

Note

Complexity: O(last - first) applications of the operator+().

Note

The type of the initial value (and the result type) T is determined from the value_type of the used FwdIter.

Note

GENERALIZED_SUM(+, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • GENERALIZED_SUM(+, b1, …, bK) + GENERALIZED_SUM(+, bM, …, bN), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source begin and end iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The reduce algorithm returns a hpx::future<T> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns T otherwise (where T is the value_type of FwdIter). The reduce algorithm returns the result of the generalized sum (applying operator+()) over the elements given by the input range [first, last).

template<typename FwdIter, typename F, typename T = typename std::iterator_traits<FwdIter>::value_type>
T reduce_deterministic(FwdIter first, FwdIter last, T init, F &&f)#

Returns GENERALIZED_SUM(f, init, *first, …, *(first + (last - first) - 1)).

The difference between reduce and accumulate is that the behavior of reduce may be non-deterministic for a non-associative or non-commutative binary operation.

Note

Complexity: O(last - first) applications of the predicate f.

Note

GENERALIZED_SUM(op, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • op(GENERALIZED_SUM(op, b1, …, bK), GENERALIZED_SUM(op, bM, …, bN)), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters
  • FwdIter – The type of the source begin and end iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of reduce requires F to meet the requirements of CopyConstructible.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • init – The initial value for the generalized sum.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a binary predicate. The signature of this predicate should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&. The types Type1 and Ret must be such that an object of type FwdIter can be dereferenced and then implicitly converted to any of those types.

Returns

The reduce algorithm returns T. The reduce algorithm returns the result of the generalized sum over the elements given by the input range [first, last).

template<typename FwdIter, typename T = typename std::iterator_traits<FwdIter>::value_type>
T reduce_deterministic(FwdIter first, FwdIter last, T init)#

Returns GENERALIZED_SUM(+, init, *first, …, *(first + (last - first) - 1)).

The difference between reduce and accumulate is that the behavior of reduce may be non-deterministic for a non-associative or non-commutative binary operation.

Note

Complexity: O(last - first) applications of the operator+().

Note

GENERALIZED_SUM(+, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • GENERALIZED_SUM(+, b1, …, bK) + GENERALIZED_SUM(+, bM, …, bN), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters
  • FwdIter – The type of the source begin and end iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • init – The initial value for the generalized sum.

Returns

The reduce algorithm returns a T. The reduce algorithm returns the result of the generalized sum (applying operator+()) over the elements given by the input range [first, last).

template<typename FwdIter>
std::iterator_traits<FwdIter>::value_type reduce_deterministic(FwdIter first, FwdIter last)#

Returns GENERALIZED_SUM(+, T(), *first, …, *(first + (last - first) - 1)).

The difference between reduce and accumulate is that the behavior of reduce may be non-deterministic for a non-associative or non-commutative binary operation.

Note

Complexity: O(last - first) applications of the operator+().

Note

The type of the initial value (and the result type) T is determined from the value_type of the used FwdIter.

Note

GENERALIZED_SUM(+, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • GENERALIZED_SUM(+, b1, …, bK) + GENERALIZED_SUM(+, bM, …, bN), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters
  • FwdIter – The type of the source begin and end iterators used (deduced). This iterator type must meet the requirements of an input iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The reduce algorithm returns T (where T is the value_type of FwdIter). The reduce algorithm returns the result of the generalized sum (applying operator+()) over the elements given by the input range [first, last).

hpx::remove, hpx::remove_if#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter, typename T = typename std::iterator_traits<FwdIter>::value_type>
FwdIter remove(FwdIter first, FwdIter last, T const &value)#

Removes all elements satisfying specific criteria from the range [first, last) and returns a past-the-end iterator for the new end of the range. This version removes all elements that are equal to value.

The assignments in the remove algorithm execute in sequential order in the calling thread.

Note

Complexity: Performs not more than last - first assignments, exactly last - first applications of the operator==().

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to remove (deduced). This value type must meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • value – Specifies the value of elements to remove.

Returns

The remove algorithm returns a FwdIter. The remove algorithm returns the iterator to the new end of the range.
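Because remove only returns the new logical end and never shrinks the container, callers usually follow it with an erase. The sketch below shows this with std::remove, which takes the same (first, last, value) arguments as the overload documented above. Substituting hpx::remove is an assumption that additionally requires the HPX headers and a running HPX runtime. The helper name erase_value is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a copy of v without any element equal
// to value, using the remove-then-erase pattern.
std::vector<int> erase_value(std::vector<int> v, int value)
{
    const std::size_t original_size = v.size();

    // Shift the kept elements to the front and get the new logical end.
    auto new_end = std::remove(v.begin(), v.end(), value);
    assert(v.size() == original_size);    // remove never shrinks the container

    // Drop the leftover tail [new_end, end()), whose values are unspecified.
    v.erase(new_end, v.end());
    return v;
}
```

For the input {1, 2, 3, 2, 4} and value 2, the helper returns {1, 3, 4}, preserving the relative order of the kept elements.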

template<typename ExPolicy, typename FwdIter, typename T = typename std::iterator_traits<FwdIter>::value_type>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> remove(ExPolicy &&policy, FwdIter first, FwdIter last, T const &value)#

Removes all elements satisfying specific criteria from the range [first, last) and returns a past-the-end iterator for the new end of the range. This version removes all elements that are equal to value. Executed according to the policy.

The assignments in the parallel remove algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel remove algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs not more than last - first assignments, exactly last - first applications of the operator==().

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to remove (deduced). This value type must meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • value – Specifies the value of elements to remove.

Returns

The remove algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The remove algorithm returns the iterator to the new end of the range.

template<typename FwdIter, typename Pred>
FwdIter remove_if(FwdIter first, FwdIter last, Pred &&pred)#

Removes all elements satisfying specific criteria from the range [first, last) and returns a past-the-end iterator for the new end of the range. This version removes all elements for which predicate pred returns true.

The assignments in the remove_if algorithm execute in sequential order in the calling thread.

Note

Complexity: Performs not more than last - first assignments, exactly last - first applications of the predicate pred.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of remove_if requires Pred to meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the required elements. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The remove_if algorithm returns a FwdIter. The remove_if algorithm returns the iterator to the new end of the range.
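As a small illustration of the returned iterator, the sketch below uses std::remove_if, which has the same (first, last, pred) call shape as the overload above. Using hpx::remove_if instead is an assumption and needs the HPX runtime. The helper name and the is_negative predicate are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: removes negative values in place and returns
// how many elements remain in front of the new logical end.
std::size_t keep_count_after_remove_if(std::vector<int>& v)
{
    auto is_negative = [](int x) { return x < 0; };

    auto new_end = std::remove_if(v.begin(), v.end(), is_negative);

    // Postcondition: no kept element satisfies the predicate.
    assert(std::none_of(v.begin(), new_end, is_negative));

    return static_cast<std::size_t>(std::distance(v.begin(), new_end));
}
```

For {3, -1, 4, -1, 5} the helper returns 3, and the first three elements of the vector are 3, 4, 5. The vector still holds five elements until the caller erases the tail.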

template<typename ExPolicy, typename FwdIter, typename Pred>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> remove_if(ExPolicy &&policy, FwdIter first, FwdIter last, Pred &&pred)#

Removes all elements satisfying specific criteria from the range [first, last) and returns a past-the-end iterator for the new end of the range. This version removes all elements for which predicate pred returns true. Executed according to the policy.

The assignments in the parallel remove_if algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel remove_if algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs not more than last - first assignments, exactly last - first applications of the predicate pred.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of remove_if requires Pred to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the required elements. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter can be dereferenced and then implicitly converted to Type.

Returns

The remove_if algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The remove_if algorithm returns the iterator to the new end of the range.

hpx::remove_copy, hpx::remove_copy_if#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter, typename OutIter, typename T = typename std::iterator_traits<InIter>::value_type>
OutIter remove_copy(InIter first, InIter last, OutIter dest, T const &value)#

Copies the elements in the range, defined by [first, last), to another range beginning at dest. Copies only the elements for which the comparison operator returns false when compared to value. The order of the elements that are not removed is preserved.

Effects: Copies all the elements referred to by the iterator it in the range [first,last) for which the following corresponding conditions do not hold: *it == value

The assignments in the remove_copy algorithm execute in sequential order in the calling thread.

Note

Complexity: Performs not more than last - first assignments, exactly last - first applications of the operator==().

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • T – The type that the result of dereferencing InIter is compared to.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • value – Value to be removed.

Returns

The remove_copy algorithm returns an OutIter. The remove_copy algorithm returns the iterator to the element past the last element copied.
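The sketch below shows the copy-while-filtering behavior with std::remove_copy, which takes the same (first, last, dest, value) arguments as the overload above. Swapping in hpx::remove_copy is an assumption that requires the HPX runtime. The helper name copy_without is hypothetical. An output iterator such as std::back_inserter suits this non-policy form, whereas the policy overloads document forward iterators for dest.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: copies every element of in that is not equal
// to value into a new vector, leaving in untouched.
std::vector<int> copy_without(const std::vector<int>& in, int value)
{
    std::vector<int> out;
    out.reserve(in.size());

    std::remove_copy(in.begin(), in.end(), std::back_inserter(out), value);

    assert(out.size() <= in.size());    // at most last - first assignments
    return out;
}
```

For the input {0, 7, 0, 8, 9, 0} and value 0, the helper returns {7, 8, 9}.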

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename T = typename std::iterator_traits<FwdIter1>::value_type>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> remove_copy(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, T const &value)#

Copies the elements in the range, defined by [first, last), to another range beginning at dest. Copies only the elements for which the comparison operator returns false when compared to value. The order of the elements that are not removed is preserved. Executed according to the policy.

Effects: Copies all the elements referred to by the iterator it in the range [first,last) for which the following corresponding conditions do not hold: *it == value

The assignments in the parallel remove_copy algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel remove_copy algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs not more than last - first assignments, exactly last - first applications of the operator==().

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type that the result of dereferencing FwdIter1 is compared to.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • value – Value to be removed.

Returns

The remove_copy algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The remove_copy algorithm returns the iterator to the element past the last element copied.

template<typename InIter, typename OutIter, typename Pred>
OutIter remove_copy_if(InIter first, InIter last, OutIter dest, Pred &&pred)#

Copies the elements in the range, defined by [first, last), to another range beginning at dest. Copies only the elements for which the predicate pred returns false. The order of the elements that are not removed is preserved.

Effects: Copies all the elements referred to by the iterator it in the range [first,last) for which the following corresponding conditions do not hold: INVOKE(pred, *it) != false.

The assignments in the remove_copy_if algorithm execute in sequential order in the calling thread.

Note

Complexity: Performs not more than last - first assignments, exactly last - first applications of the predicate pred.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • Pred – The type of the function/function object to use (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the elements to be removed. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type InIter can be dereferenced and then implicitly converted to Type.

Returns

The remove_copy_if algorithm returns an OutIter. The remove_copy_if algorithm returns the iterator to the element past the last element copied.
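The returned iterator is what lets a caller trim a pre-sized destination. The sketch below demonstrates this with std::remove_copy_if, which has the same (first, last, dest, pred) call shape as the overload above. Using hpx::remove_copy_if instead is an assumption and needs the HPX runtime. The helper name copy_evens and the is_odd predicate are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: copies the even values of in into a pre-sized
// destination and trims it at the iterator returned by the algorithm.
std::vector<int> copy_evens(const std::vector<int>& in)
{
    auto is_odd = [](int x) { return x % 2 != 0; };

    std::vector<int> out(in.size());    // room for the worst case
    auto out_end =
        std::remove_copy_if(in.begin(), in.end(), out.begin(), is_odd);

    // Everything from out_end onward was never written by the algorithm.
    out.erase(out_end, out.end());

    assert(std::none_of(out.begin(), out.end(), is_odd));
    return out;
}
```

For the input {1, 2, 3, 4, 5, 6} the helper returns {2, 4, 6}. A pre-sized destination like this one also satisfies the forward iterator requirement documented for the policy overloads.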

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Pred>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> remove_copy_if(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, Pred &&pred)#

Copies the elements in the range, defined by [first, last), to another range beginning at dest. Copies only the elements for which the predicate pred returns false. The order of the elements that are not removed is preserved. Executed according to the policy.

Effects: Copies all the elements referred to by the iterator it in the range [first,last) for which the following corresponding conditions do not hold: INVOKE(pred, *it) != false.

The assignments in the parallel remove_copy_if algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel remove_copy_if algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs not more than last - first assignments, exactly last - first applications of the predicate pred.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of remove_copy_if requires Pred to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the elements to be removed. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter1 can be dereferenced and then implicitly converted to Type.

Returns

The remove_copy_if algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The remove_copy_if algorithm returns the iterator to the element past the last element copied.

hpx::replace, hpx::replace_if, hpx::replace_copy, hpx::replace_copy_if#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter, typename T = typename std::iterator_traits<InIter>::value_type>
void replace(InIter first, InIter last, T const &old_value, T const &new_value)#

Replaces all elements satisfying specific criteria with new_value in the range [first, last).

Effects: Substitutes elements referred to by the iterator it in the range [first, last) with new_value, when the following corresponding conditions hold: *it == old_value

The assignments in the parallel replace algorithm execute in sequential order in the calling thread.

Note

Complexity: Performs exactly last - first comparisons and at most last - first assignments.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • T – The type of the old and new values to replace (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • old_value – Refers to the old value of the elements to replace.

  • new_value – Refers to the new value to use as the replacement.

Returns

The replace algorithm returns void.
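The Effects clause above can be sketched as follows. The example uses std::replace as a stand-in so that it compiles without HPX; this is an assumption of the sketch, relying on hpx::replace being documented with the same argument order. The helper name replaced is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: returns a copy of `v` in which every element equal to
// old_value has been replaced by new_value.
std::vector<int> replaced(std::vector<int> v, int old_value, int new_value)
{
    // Same argument order as the replace signature documented above.
    std::replace(v.begin(), v.end(), old_value, new_value);

    // Postcondition: no element compares equal to old_value any more
    // (unless old_value and new_value are the same value).
    assert(old_value == new_value ||
        std::count(v.begin(), v.end(), old_value) == 0);
    return v;
}
```

Elements that do not compare equal to old_value keep their value and their position.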

template<typename ExPolicy, typename FwdIter, typename T = typename std::iterator_traits<FwdIter>::value_type>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, void> replace(ExPolicy &&policy, FwdIter first, FwdIter last, T const &old_value, T const &new_value)#

Replaces all elements equal to old_value with new_value in the range [first, last). Executed according to the policy.

Effects: Substitutes elements referred to by the iterator it in the range [first, last) with new_value when the following condition holds: *it == old_value

The assignments in the parallel replace algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel replace algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first comparisons and at most last - first assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the old and new values to replace (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • old_value – Refers to the old value of the elements to replace.

  • new_value – Refers to the new value to use as the replacement.

Returns

The replace algorithm returns a hpx::future<void> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns void otherwise.

template<typename Iter, typename Pred, typename T = typename std::iterator_traits<Iter>::value_type>
void replace_if(Iter first, Iter last, Pred &&pred, T const &new_value)#

Replaces all elements for which the predicate pred returns true with new_value in the range [first, last).

Effects: Substitutes elements referred to by the iterator it in the range [first, last) with new_value when the following condition holds: INVOKE(pred, *it) != false

The assignments in the replace_if algorithm execute in sequential order in the calling thread.

Note

Complexity: Performs exactly last - first applications of the predicate.

Template Parameters
  • Iter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of replace_if requires Pred to meet the requirements of CopyConstructible.

  • T – The type of the new values to replace (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the elements which need to be replaced. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type Iter can be dereferenced and then implicitly converted to Type.

  • new_value – Refers to the new value to use as the replacement.

Returns

The replace_if algorithm returns void.
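A small sketch of the unary predicate contract described above. It uses std::replace_if as a stand-in for hpx::replace_if so that it builds without HPX (an assumption of this example); the helper name clamp_negatives_to_zero is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: replaces every negative element of `v` with zero.
std::vector<int> clamp_negatives_to_zero(std::vector<int> v)
{
    // Unary predicate: returns true for the elements that need to be
    // replaced. It takes its argument by const& and does not modify it.
    auto is_negative = [](int const& x) { return x < 0; };

    std::replace_if(v.begin(), v.end(), is_negative, 0);

    // Postcondition: no remaining element satisfies the predicate.
    assert(std::none_of(v.begin(), v.end(), is_negative));
    return v;
}
```

The predicate is applied once per element, which matches the complexity stated in the Note above.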

template<typename ExPolicy, typename FwdIter, typename Pred, typename T = typename std::iterator_traits<FwdIter>::value_type>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, void> replace_if(ExPolicy &&policy, FwdIter first, FwdIter last, Pred &&pred, T const &new_value)#

Replaces all elements for which the predicate pred returns true with new_value in the range [first, last). Executed according to the policy.

Effects: Substitutes elements referred to by the iterator it in the range [first, last) with new_value when the following condition holds: INVOKE(pred, *it) != false

The assignments in the parallel replace_if algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel replace_if algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first applications of the predicate.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of replace_if requires Pred to meet the requirements of CopyConstructible.

  • T – The type of the new values to replace (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the elements which need to be replaced. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter can be dereferenced and then implicitly converted to Type.

  • new_value – Refers to the new value to use as the replacement.

Returns

The replace_if algorithm returns a hpx::future<void> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns void otherwise.

template<typename InIter, typename OutIter, typename T = typename std::iterator_traits<OutIter>::value_type>
OutIter replace_copy(InIter first, InIter last, OutIter dest, T const &old_value, T const &new_value)#

Copies all elements from the range [first, last) to another range beginning at dest, replacing all elements equal to old_value with new_value.

Effects: Assigns to every iterator it in the range [dest, dest + (last - first)) either new_value or *(first + (it - dest)) depending on whether the following condition holds: *(first + (it - dest)) == old_value

The assignments in the replace_copy algorithm execute in sequential order in the calling thread.

Note

Complexity: Performs exactly last - first comparisons.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • T – The type of the old and new values (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • old_value – Refers to the old value of the elements to replace.

  • new_value – Refers to the new value to use as the replacement.

Returns

The replace_copy algorithm returns an OutIter. The replace_copy algorithm returns the iterator to the element past the last element copied.
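The return value described above can be illustrated with a short sketch. It uses std::replace_copy as a stand-in for hpx::replace_copy so that it compiles without HPX (an assumption of this example); the helper name replace_copy_into_new_vector is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: copies `src` into a new vector, writing new_value in
// place of every element that compares equal to old_value.
std::vector<int> replace_copy_into_new_vector(
    std::vector<int> const& src, int old_value, int new_value)
{
    std::vector<int> dest(src.size());

    auto end_of_copy = std::replace_copy(
        src.begin(), src.end(), dest.begin(), old_value, new_value);

    // All last - first elements are written, so the returned iterator is
    // dest + (last - first), one past the last element copied.
    assert(end_of_copy == dest.end());
    return dest;
}
```

Unlike replace, this variant leaves the source range unchanged and writes only to the destination range.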

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename T = typename std::iterator_traits<FwdIter2>::value_type>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> replace_copy(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, T const &old_value, T const &new_value)#

Copies all elements from the range [first, last) to another range beginning at dest, replacing all elements equal to old_value with new_value. Executed according to the policy.

Effects: Assigns to every iterator it in the range [dest, dest + (last - first)) either new_value or *(first + (it - dest)) depending on whether the following condition holds: *(first + (it - dest)) == old_value

The assignments in the parallel replace_copy algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel replace_copy algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first comparisons.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the old and new values (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • old_value – Refers to the old value of the elements to replace.

  • new_value – Refers to the new value to use as the replacement.

Returns

The replace_copy algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The replace_copy algorithm returns the iterator to the element past the last element copied.

template<typename InIter, typename OutIter, typename Pred, typename T = typename std::iterator_traits<OutIter>::value_type>
OutIter replace_copy_if(InIter first, InIter last, OutIter dest, Pred &&pred, T const &new_value)#

Copies all elements from the range [first, last) to another range beginning at dest, replacing all elements for which the predicate pred returns true with new_value.

Effects: Assigns to every iterator it in the range [dest, dest + (last - first)) either new_value or *(first + (it - dest)) depending on whether the following condition holds: INVOKE(pred, *(first + (it - dest))) != false

The assignments in the replace_copy_if algorithm execute in sequential order in the calling thread.

Note

Complexity: Performs exactly last - first applications of the predicate.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of replace_copy_if requires Pred to meet the requirements of CopyConstructible.

  • T – The type of the new values to replace (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the elements which need to be replaced. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type InIter can be dereferenced and then implicitly converted to Type.

  • new_value – Refers to the new value to use as the replacement.

Returns

The replace_copy_if algorithm returns an OutIter. The replace_copy_if algorithm returns the output iterator to the element in the destination range, one past the last element copied.
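The Effects clause above can be read as the following sequential reference loop. This is a sketch for illustration only and not the HPX implementation; both replace_copy_if_reference and zero_out_odds are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical sequential reference loop for the Effects clause: every
// destination element receives either new_value or the source element.
template <typename InIter, typename OutIter, typename Pred, typename T>
OutIter replace_copy_if_reference(
    InIter first, InIter last, OutIter dest, Pred pred, T const& new_value)
{
    for (; first != last; ++first, ++dest)
    {
        // One application of the predicate per source element.
        if (pred(*first))
            *dest = new_value;
        else
            *dest = *first;
    }
    // One past the last element written in the destination range.
    return dest;
}

// Hypothetical usage helper: copies `src`, writing zero for every odd value.
std::vector<int> zero_out_odds(std::vector<int> const& src)
{
    std::vector<int> dest;
    replace_copy_if_reference(src.begin(), src.end(),
        std::back_inserter(dest), [](int const& x) { return x % 2 != 0; }, 0);

    // Exactly last - first elements are written.
    assert(dest.size() == src.size());
    return dest;
}
```

Because std::back_inserter is only an output iterator, the usage helper matches the iterator requirements of the overload without an execution policy.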

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Pred, typename T = typename std::iterator_traits<FwdIter2>::value_type>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> replace_copy_if(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, Pred &&pred, T const &new_value)#

Copies all elements from the range [first, last) to another range beginning at dest, replacing all elements for which the predicate pred returns true with new_value. Executed according to the policy.

Effects: Assigns to every iterator it in the range [dest, dest + (last - first)) either new_value or *(first + (it - dest)) depending on whether the following condition holds: INVOKE(pred, *(first + (it - dest))) != false

The assignments in the parallel replace_copy_if algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel replace_copy_if algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first applications of the predicate.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of replace_copy_if requires Pred to meet the requirements of CopyConstructible.

  • T – The type of the new values to replace (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • pred – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the elements which need to be replaced. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter1 can be dereferenced and then implicitly converted to Type.

  • new_value – Refers to the new value to use as the replacement.

Returns

The replace_copy_if algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The replace_copy_if algorithm returns the iterator to the element in the destination range, one past the last element copied.

hpx::reverse, hpx::reverse_copy#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename BidirIter>
void reverse(BidirIter first, BidirIter last)#

Reverses the order of the elements in the range [first, last). Behaves as if applying std::iter_swap to every pair of iterators first+i, (last-i) - 1 for each non-negative i < (last-first)/2.

The assignments in the reverse algorithm execute in sequential order in the calling thread.

Note

Complexity: Linear in the distance between first and last.

Template Parameters

BidirIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a bidirectional iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The reverse algorithm returns void.
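The "behaves as if" wording above can be written out as a loop over index pairs. This is a sequential sketch for illustration and not the HPX implementation; the name reverse_reference is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference loop: applies std::iter_swap to every pair of
// iterators first + i, (last - i) - 1 for each non-negative i < (last - first) / 2.
std::vector<int> reverse_reference(std::vector<int> v)
{
    // Expected result, built by reading the input back to front.
    std::vector<int> const expected(v.rbegin(), v.rend());

    auto const first = v.begin();
    auto const last = v.end();
    std::ptrdiff_t const n = last - first;

    for (std::ptrdiff_t i = 0; i < n / 2; ++i)
    {
        std::iter_swap(first + i, (last - i) - 1);
    }

    assert(v == expected);
    return v;
}
```

For a range with an odd number of elements the middle element is never swapped, which is why the loop bound is (last - first) / 2.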

template<typename ExPolicy, typename BidirIter>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, void> reverse(ExPolicy &&policy, BidirIter first, BidirIter last)#

Reverses the order of the elements in the range [first, last). Behaves as if applying std::iter_swap to every pair of iterators first+i, (last-i) - 1 for each non-negative i < (last-first)/2. Executed according to the policy.

The assignments in the parallel reverse algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel reverse algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Linear in the distance between first and last.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • BidirIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a bidirectional iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The reverse algorithm returns a hpx::future<void> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns void otherwise.

template<typename BidirIter, typename OutIter>
OutIter reverse_copy(BidirIter first, BidirIter last, OutIter dest)#

Copies the elements from the range [first, last) to another range beginning at dest in such a way that the elements in the new range are in reverse order. Behaves as if by executing the assignment *(dest + (last - first) - 1 - i) = *(first + i) once for each non-negative i < (last - first). If the source and destination ranges (that is, [first, last) and [dest, dest+(last-first)) respectively) overlap, the behavior is undefined.

The assignments in the reverse_copy algorithm execute in sequential order in the calling thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • BidirIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a bidirectional iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The reverse_copy algorithm returns an OutIter. The reverse_copy algorithm returns the output iterator to the element in the destination range, one past the last element copied.
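The assignment formula in the description above can be checked with a short sequential sketch. It cross-checks a hand-written loop against std::reverse_copy, which is used here as a stand-in for hpx::reverse_copy (an assumption made so the example builds without HPX). The name reverse_copy_reference is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference loop: executes the assignment
// *(dest + (last - first) - 1 - i) = *(first + i) once for each i.
std::vector<int> reverse_copy_reference(std::vector<int> const& src)
{
    // Separate destination storage, so source and destination do not overlap.
    std::vector<int> dest(src.size());

    auto const first = src.begin();
    auto const last = src.end();
    auto const d = dest.begin();

    for (std::ptrdiff_t i = 0; i < last - first; ++i)
    {
        *(d + (last - first) - 1 - i) = *(first + i);
    }

    // Cross-check against the standard algorithm and its return value.
    std::vector<int> check(src.size());
    auto const end_of_copy =
        std::reverse_copy(src.begin(), src.end(), check.begin());
    assert(end_of_copy == check.end());
    assert(check == dest);
    return dest;
}
```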

template<typename ExPolicy, typename BidirIter, typename FwdIter>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> reverse_copy(ExPolicy &&policy, BidirIter first, BidirIter last, FwdIter dest)#

Copies the elements from the range [first, last) to another range beginning at dest in such a way that the elements in the new range are in reverse order. Behaves as if by executing the assignment *(dest + (last - first) - 1 - i) = *(first + i) once for each non-negative i < (last - first). If the source and destination ranges (that is, [first, last) and [dest, dest+(last-first)) respectively) overlap, the behavior is undefined. Executed according to the policy.

The assignments in the parallel reverse_copy algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel reverse_copy algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • BidirIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a bidirectional iterator.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The reverse_copy algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The reverse_copy algorithm returns the output iterator to the element in the destination range, one past the last element copied.

hpx::rotate, hpx::rotate_copy#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter>
FwdIter rotate(FwdIter first, FwdIter new_first, FwdIter last)#

Performs a left rotation on a range of elements. Specifically, rotate swaps the elements in the range [first, last) in such a way that the element at new_first becomes the first element of the new range and the element at new_first - 1 becomes the last element.

The assignments in the rotate algorithm execute in sequential order in the calling thread.

Note

Complexity: Linear in the distance between first and last.

Note

The type of dereferenced FwdIter must meet the requirements of MoveAssignable and MoveConstructible.

Template Parameters

FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • new_first – Refers to the element that should appear at the beginning of the rotated range.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The rotate algorithm returns a FwdIter. The rotate algorithm returns the iterator to the new location of the element pointed to by first, equal to first + (last - new_first).
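A brief sketch of the return value described above. It uses std::rotate as a stand-in for hpx::rotate so that it builds without HPX (an assumption of this example); the helper name rotate_left and its index-based interface are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: rotates `v` left so that the element at index
// new_first_index becomes the first element. Returns the index at which the
// former first element ends up.
std::size_t rotate_left(std::vector<int>& v, std::size_t new_first_index)
{
    // new_first must lie within [first, last].
    assert(new_first_index <= v.size());

    auto const new_first =
        v.begin() + static_cast<std::ptrdiff_t>(new_first_index);
    auto const it = std::rotate(v.begin(), new_first, v.end());

    // The returned iterator equals first + (last - new_first).
    return static_cast<std::size_t>(it - v.begin());
}
```

For the values 1, 2, 3, 4, 5 and new_first_index 2 the vector becomes 3, 4, 5, 1, 2 and the function returns 3, the new position of the former first element.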

template<typename ExPolicy, typename FwdIter>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> rotate(ExPolicy &&policy, FwdIter first, FwdIter new_first, FwdIter last)#

Performs a left rotation on a range of elements. Specifically, rotate swaps the elements in the range [first, last) in such a way that the element at new_first becomes the first element of the new range and the element at new_first - 1 becomes the last element. Executed according to the policy.

The assignments in the parallel rotate algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel rotate algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Linear in the distance between first and last.

Note

The type of dereferenced FwdIter must meet the requirements of MoveAssignable and MoveConstructible.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • new_first – Refers to the element that should appear at the beginning of the rotated range.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The rotate algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The rotate algorithm returns the iterator equal to first + (last - new_first).

template<typename FwdIter, typename OutIter>
OutIter rotate_copy(FwdIter first, FwdIter new_first, FwdIter last, OutIter dest_first)#

Copies the elements from the range [first, last) to another range beginning at dest_first in such a way that the element at new_first becomes the first element of the new range and the element at new_first - 1 becomes the last element.

The assignments in the rotate_copy algorithm execute in sequential order in the calling thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • new_first – Refers to the element that should appear at the beginning of the rotated range.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest_first – Refers to the beginning of the destination range.

Returns

The rotate_copy algorithm returns an output iterator. The rotate_copy algorithm returns the output iterator to the element past the last element copied.
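The behavior described above can be sketched as follows. std::rotate_copy is used as a stand-in for hpx::rotate_copy so that the example compiles without HPX (an assumption of this sketch); the helper name rotated_copy and its index-based interface are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a left-rotated copy of `src` in which the
// element at index new_first_index comes first. `src` itself is not modified.
std::vector<int> rotated_copy(
    std::vector<int> const& src, std::size_t new_first_index)
{
    // new_first must lie within [first, last].
    assert(new_first_index <= src.size());

    std::vector<int> dest(src.size());
    auto const new_first =
        src.begin() + static_cast<std::ptrdiff_t>(new_first_index);

    auto const end_of_copy =
        std::rotate_copy(src.begin(), new_first, src.end(), dest.begin());

    // Exactly last - first assignments are performed, so the returned
    // iterator is one past the last element of the destination range.
    assert(end_of_copy == dest.end());
    return dest;
}
```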

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> rotate_copy(ExPolicy &&policy, FwdIter1 first, FwdIter1 new_first, FwdIter1 last, FwdIter2 dest_first)#

Copies the elements from the range [first, last) to another range beginning at dest_first in such a way that the element at new_first becomes the first element of the new range and the element at new_first - 1 becomes the last element. Executed according to the policy.

The assignments in the parallel rotate_copy algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel rotate_copy algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • new_first – Refers to the element that should appear at the beginning of the rotated range.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest_first – Refers to the beginning of the destination range.

Returns

The rotate_copy algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The rotate_copy algorithm returns the output iterator to the element past the last element copied.

hpx::search, hpx::search_n#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter, typename FwdIter2, typename Pred = parallel::detail::equal_to>
FwdIter search(FwdIter first, FwdIter last, FwdIter2 s_first, FwdIter2 s_last, Pred &&op = Pred())#

Searches the range [first, last) for the first occurrence of the subsequence [s_first, s_last). Uses a provided predicate to compare elements.

The comparison operations in the search algorithm execute in sequential order in the calling thread.

Note

Complexity: at most (S*N) comparisons where S = distance(s_first, s_last) and N = distance(first, last).

Template Parameters
  • FwdIter – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of search requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

Parameters
  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • s_first – Refers to the beginning of the sequence of elements the algorithm will be searching for.

  • s_last – Refers to the end of the sequence of elements the algorithm will be searching for.

  • op – Refers to the binary predicate which returns true if the elements should be treated as equal. The signature of the function should be equivalent to

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The search algorithm returns FwdIter. The search algorithm returns an iterator to the beginning of the first subsequence [s_first, s_last) in range [first, last). If the length of the subsequence [s_first, s_last) is greater than the length of the range [first, last), last is returned. Additionally, if the subsequence is empty, first is returned. If no subsequence is found, last is returned.
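
The return-value rules above can be checked with a small sketch. It uses std::search as a stand-in, on the assumption that hpx::search without an execution policy follows the same conventions; the helper find_offset is a hypothetical name introduced only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>

// Returns the offset of the first occurrence of `needle` in `haystack`.
// When nothing is found, search returns `last`, so the offset equals haystack.size().
std::size_t find_offset(const std::string& haystack, const std::string& needle)
{
    auto it = std::search(haystack.begin(), haystack.end(),
                          needle.begin(), needle.end());
    return static_cast<std::size_t>(std::distance(haystack.begin(), it));
}
```

With this helper, a match in "abcabc" for "bca" is reported at offset 1, an empty needle maps to offset 0 (first is returned), and a needle that is longer than the haystack or simply absent maps to haystack.size() (last is returned).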

template<typename ExPolicy, typename FwdIter, typename FwdIter2, typename Pred = parallel::detail::equal_to>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> search(ExPolicy &&policy, FwdIter first, FwdIter last, FwdIter2 s_first, FwdIter2 s_last, Pred &&op = Pred())

Searches the range [first, last) for the first occurrence of the subsequence [s_first, s_last). Uses a provided predicate to compare elements. Executed according to the policy.

The comparison operations in the parallel search algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel search algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: at most (S*N) comparisons where S = distance(s_first, s_last) and N = distance(first, last).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of search requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • s_first – Refers to the beginning of the sequence of elements the algorithm will be searching for.

  • s_last – Refers to the end of the sequence of elements the algorithm will be searching for.

  • op – Refers to the binary predicate which returns true if the elements should be treated as equal. The signature of the function should be equivalent to

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The search algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The search algorithm returns an iterator to the beginning of the first subsequence [s_first, s_last) in range [first, last). If the length of the subsequence [s_first, s_last) is greater than the length of the range [first, last), last is returned. Additionally, if the subsequence is empty, first is returned. If no subsequence is found, last is returned.
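
A custom predicate for op might look like the following sketch, which performs a case-insensitive search. It is written against std::search so that it stays self-contained; the assumption is that the same lambda could be passed as op to hpx::search. The name contains_ignore_case is hypothetical.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Reports whether `word` occurs in `text`, ignoring letter case.
bool contains_ignore_case(const std::string& text, const std::string& word)
{
    // Binary predicate: returns true if the two characters should be treated as equal.
    // It takes its arguments by value and has no side effects.
    auto same_letter = [](char a, char b) {
        return std::tolower(static_cast<unsigned char>(a)) ==
               std::tolower(static_cast<unsigned char>(b));
    };
    // An empty pattern matches at `first`, which equals `last` for an empty text,
    // so that case is handled explicitly.
    if (word.empty())
        return true;
    return std::search(text.begin(), text.end(), word.begin(), word.end(),
                       same_letter) != text.end();
}
```

Because the predicate neither modifies its arguments nor touches shared state, it would also be safe to invoke from several threads at once, which matters when comparisons are permitted to run unordered under a parallel policy.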

template<typename FwdIter, typename FwdIter2, typename Pred = parallel::detail::equal_to>
FwdIter search_n(FwdIter first, std::size_t count, FwdIter2 s_first, FwdIter2 s_last, Pred &&op = Pred())

Searches the range [first, first+count) for the subsequence [s_first, s_last). Uses a provided predicate to compare elements.

The comparison operations in this overload of search_n execute in sequential order in the calling thread.

Note

Complexity: at most (S*N) comparisons where S = distance(s_first, s_last) and N = count.

Template Parameters
  • FwdIter – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of search_n requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

Parameters
  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • count – The number of elements, starting at first, of the first range the algorithm will be applied to.

  • s_first – Refers to the beginning of the sequence of elements the algorithm will be searching for.

  • s_last – Refers to the end of the sequence of elements the algorithm will be searching for.

  • op – Refers to the binary predicate which returns true if the elements should be treated as equal. The signature of the function should be equivalent to

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The search_n algorithm returns FwdIter. The search_n algorithm returns an iterator to the beginning of the last subsequence [s_first, s_last) in range [first, first+count). If the length of the subsequence [s_first, s_last) is greater than the length of the range [first, first+count), first is returned. Additionally, if the subsequence is empty or no subsequence is found, first is also returned.

template<typename ExPolicy, typename FwdIter, typename FwdIter2, typename Pred = parallel::detail::equal_to>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> search_n(ExPolicy &&policy, FwdIter first, std::size_t count, FwdIter2 s_first, FwdIter2 s_last, Pred &&op = Pred())

Searches the range [first, first+count) for the subsequence [s_first, s_last). Uses a provided predicate to compare elements. Executed according to the policy.

The comparison operations in the parallel search_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel search_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: at most (S*N) comparisons where S = distance(s_first, s_last) and N = count.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of search_n requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • count – The number of elements, starting at first, of the first range the algorithm will be applied to.

  • s_first – Refers to the beginning of the sequence of elements the algorithm will be searching for.

  • s_last – Refers to the end of the sequence of elements the algorithm will be searching for.

  • op – Refers to the binary predicate which returns true if the elements should be treated as equal. The signature of the function should be equivalent to

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types FwdIter and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

Returns

The search_n algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The search_n algorithm returns an iterator to the beginning of the last subsequence [s_first, s_last) in range [first, first+count). If the length of the subsequence [s_first, s_last) is greater than the length of the range [first, first+count), first is returned. Additionally, if the subsequence is empty or no subsequence is found, first is also returned.
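
The effect of count, which limits the search to the first count elements, can be sketched as follows. This is only a model of the range restriction built on std::search; it does not reproduce the return-value convention of search_n described above, and the helper occurs_in_prefix is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Reports whether `pattern` occurs entirely within the first `count` elements of `data`.
// Preconditions (assumed, not checked): count <= data.size() and pattern is non-empty.
bool occurs_in_prefix(const std::vector<int>& data, std::size_t count,
                      const std::vector<int>& pattern)
{
    // Only [first, first + count) is examined.
    auto prefix_end = std::next(data.begin(), static_cast<std::ptrdiff_t>(count));
    return std::search(data.begin(), prefix_end,
                       pattern.begin(), pattern.end()) != prefix_end;
}
```

For data 1, 2, 3, 4, 5 the pattern 3, 4 is found with count = 4 but not with count = 3, because the match would extend past the examined prefix.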

hpx::set_difference

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename FwdIter3, typename Pred = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter3> set_difference(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, FwdIter3 dest, Pred &&op = Pred())

Constructs a sorted range beginning at dest consisting of all elements present in the range [first1, last1) and not present in the range [first2, last2). This algorithm expects both input ranges to be sorted with the given binary predicate op. Executed according to the policy.

Equivalent elements are treated individually, that is, if some element is found m times in [first1, last1) and n times in [first2, last2), it will be copied to dest exactly std::max(m-n, 0) times. The resulting range cannot overlap with either of the input ranges.

Function objects applied by a parallel algorithm invoked with a sequential execution policy object execute in sequential order in the calling thread (for sequenced_policy) or in a single new thread spawned from the current thread (for sequenced_task_policy).

Function objects applied by a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and are indeterminately sequenced within each thread.

Note

Complexity: At most 2*(N1 + N2 - 1) comparisons, where N1 is the length of the first sequence and N2 is the length of the second sequence.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • FwdIter1 – The type of the source iterators used (deduced) representing the first sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used (deduced) representing the second sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter3 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of set_difference requires Pred to meet the requirements of CopyConstructible. This defaults to std::less<>

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – The binary predicate which returns true if the first argument should be treated as ordered before (less than) the second. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1.

Returns

The set_difference algorithm returns a hpx::future<FwdIter3> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter3 otherwise. The set_difference algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename FwdIter1, typename FwdIter2, typename FwdIter3, typename Pred = hpx::parallel::detail::less>
FwdIter3 set_difference(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, FwdIter3 dest, Pred &&op = Pred())

Constructs a sorted range beginning at dest consisting of all elements present in the range [first1, last1) and not present in the range [first2, last2). This algorithm expects both input ranges to be sorted with the given binary predicate op.

Equivalent elements are treated individually, that is, if some element is found m times in [first1, last1) and n times in [first2, last2), it will be copied to dest exactly std::max(m-n, 0) times. The resulting range cannot overlap with either of the input ranges.

Note

Complexity: At most 2*(N1 + N2 - 1) comparisons, where N1 is the length of the first sequence and N2 is the length of the second sequence.

Template Parameters
  • FwdIter1 – The type of the source iterators used (deduced) representing the first sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used (deduced) representing the second sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter3 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of set_difference requires Pred to meet the requirements of CopyConstructible. This defaults to std::less<>

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – The binary predicate which returns true if the first argument should be treated as ordered before (less than) the second. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1.

Returns

The set_difference algorithm returns a FwdIter3. The set_difference algorithm returns the output iterator to the element in the destination range, one past the last element copied.
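
The multiset rule (an element found m times in the first range and n times in the second is copied max(m-n, 0) times) can be seen in a short sketch. It uses std::set_difference as a stand-in, assuming hpx::set_difference without an execution policy produces the same output; the helper name difference is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Returns the elements of `a` that are not matched by an element of `b`.
// Both inputs must already be sorted in ascending order.
std::vector<int> difference(const std::vector<int>& a, const std::vector<int>& b)
{
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));

    std::vector<int> out;
    // Writing into a separate vector guarantees that the result does not
    // overlap with either input range.
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(out));
    return out;
}
```

For a = 1, 2, 2, 2, 3, 5 and b = 2, 3, 4 the result is 1, 2, 2, 5: the value 2 appears three times in a and once in b, so two copies survive, while the single 3 is cancelled out.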

hpx::set_intersection

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename FwdIter3, typename Pred = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter3> set_intersection(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, FwdIter3 dest, Pred &&op = Pred())

Constructs a sorted range beginning at dest consisting of all elements present in both sorted ranges [first1, last1) and [first2, last2). This algorithm expects both input ranges to be sorted with the given binary predicate op. Executed according to the policy.

If some element is found m times in [first1, last1) and n times in [first2, last2), the first std::min(m, n) elements will be copied from the first range to the destination range. The order of equivalent elements is preserved. The resulting range cannot overlap with either of the input ranges.

Function objects applied by a parallel algorithm invoked with a sequential execution policy object execute in sequential order in the calling thread (for sequenced_policy) or in a single new thread spawned from the current thread (for sequenced_task_policy).

Function objects applied by a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and are indeterminately sequenced within each thread.

Note

Complexity: At most 2*(N1 + N2 - 1) comparisons, where N1 is the length of the first sequence and N2 is the length of the second sequence.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • FwdIter1 – The type of the source iterators used (deduced) representing the first sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used (deduced) representing the second sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter3 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator, or of an output iterator when executed sequentially.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of set_intersection requires Pred to meet the requirements of CopyConstructible. This defaults to std::less<>

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – The binary predicate which returns true if the first argument should be treated as ordered before (less than) the second. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1.

Returns

The set_intersection algorithm returns a hpx::future<FwdIter3> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter3 otherwise. The set_intersection algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename FwdIter1, typename FwdIter2, typename FwdIter3, typename Pred = hpx::parallel::detail::less>
FwdIter3 set_intersection(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, FwdIter3 dest, Pred &&op = Pred())

Constructs a sorted range beginning at dest consisting of all elements present in both sorted ranges [first1, last1) and [first2, last2). This algorithm expects both input ranges to be sorted with the given binary predicate op.

If some element is found m times in [first1, last1) and n times in [first2, last2), the first std::min(m, n) elements will be copied from the first range to the destination range. The order of equivalent elements is preserved. The resulting range cannot overlap with either of the input ranges.

Note

Complexity: At most 2*(N1 + N2 - 1) comparisons, where N1 is the length of the first sequence and N2 is the length of the second sequence.

Template Parameters
  • FwdIter1 – The type of the source iterators used (deduced) representing the first sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used (deduced) representing the second sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter3 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator, or of an output iterator when executed sequentially.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of set_intersection requires Pred to meet the requirements of CopyConstructible. This defaults to std::less<>

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – The binary predicate which returns true if the first argument should be treated as ordered before (less than) the second. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1.

Returns

The set_intersection algorithm returns a FwdIter3. The set_intersection algorithm returns the output iterator to the element in the destination range, one past the last element copied.
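
The sketch below illustrates two points from the description: the comparison object is the ordering that both inputs are sorted by, and the returned iterator marks one past the last element copied. It uses std::set_intersection with std::greater<> as a stand-in, assuming hpx::set_intersection accepts the same arguments; the helper intersect_descending is a hypothetical name.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Intersects two ranges that are sorted in descending order.
std::vector<int> intersect_descending(const std::vector<int>& a,
                                      const std::vector<int>& b)
{
    // The result can never be longer than the shorter input.
    std::vector<int> out(std::min(a.size(), b.size()));

    // The comparison object must match the ordering of the inputs.
    auto out_end = std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                                         out.begin(), std::greater<>{});

    // out_end points one past the last element copied; drop the unused tail.
    out.erase(out_end, out.end());
    return out;
}
```

For a = 5, 3, 3, 1 and b = 4, 3, 3, 3, 1 the result is 3, 3, 1: the value 3 occurs twice in a and three times in b, so min(2, 3) = 2 copies are written.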

hpx::set_symmetric_difference

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename FwdIter3, typename Pred = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result<ExPolicy, FwdIter3>::type set_symmetric_difference(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, FwdIter3 dest, Pred &&op = Pred())

Constructs a sorted range beginning at dest consisting of all elements present in either of the sorted ranges [first1, last1) and [first2, last2), but not in both of them. This algorithm expects both input ranges to be sorted with the given binary predicate op. Executed according to the policy.

If some element is found m times in [first1, last1) and n times in [first2, last2), it will be copied to dest exactly std::abs(m-n) times. If m>n, then the last m-n of those elements are copied from [first1,last1), otherwise the last n-m elements are copied from [first2,last2). The resulting range cannot overlap with either of the input ranges.

Function objects applied by a parallel algorithm invoked with a sequential execution policy object execute in sequential order in the calling thread (for sequenced_policy) or in a single new thread spawned from the current thread (for sequenced_task_policy).

Function objects applied by a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and are indeterminately sequenced within each thread.

Note

Complexity: At most 2*(N1 + N2 - 1) comparisons, where N1 is the length of the first sequence and N2 is the length of the second sequence.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • FwdIter1 – The type of the source iterators used (deduced) representing the first sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used (deduced) representing the second sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter3 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator, or of an output iterator when executed sequentially.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of set_symmetric_difference requires Pred to meet the requirements of CopyConstructible. This defaults to std::less<>

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – The binary predicate which returns true if the first argument should be treated as ordered before (less than) the second. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1.

Returns

The set_symmetric_difference algorithm returns a hpx::future<FwdIter3> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter3 otherwise. The set_symmetric_difference algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename FwdIter1, typename FwdIter2, typename FwdIter3, typename Pred = hpx::parallel::detail::less>
FwdIter3 set_symmetric_difference(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, FwdIter3 dest, Pred &&op = Pred())

Constructs a sorted range beginning at dest consisting of all elements present in either of the sorted ranges [first1, last1) and [first2, last2), but not in both of them. This algorithm expects both input ranges to be sorted with the given binary predicate op.

If some element is found m times in [first1, last1) and n times in [first2, last2), it will be copied to dest exactly std::abs(m-n) times. If m>n, then the last m-n of those elements are copied from [first1,last1), otherwise the last n-m elements are copied from [first2,last2). The resulting range cannot overlap with either of the input ranges.

Note

Complexity: At most 2*(N1 + N2 - 1) comparisons, where N1 is the length of the first sequence and N2 is the length of the second sequence.

Template Parameters
  • FwdIter1 – The type of the source iterators used (deduced) representing the first sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used (deduced) representing the second sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter3 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator, or of an output iterator when executed sequentially.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of set_symmetric_difference requires Pred to meet the requirements of CopyConstructible. This defaults to std::less<>

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – The binary predicate which returns true if the first argument should be treated as ordered before (less than) the second. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1.

Returns

The set_symmetric_difference algorithm returns a FwdIter3. The set_symmetric_difference algorithm returns the output iterator to the element in the destination range, one past the last element copied.
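
The multiplicity rule above (an element found m times in one input and n times in the other appears std::abs(m-n) times in the output) can be checked with a small sketch. It uses std::set_symmetric_difference from the C++ standard library as a stand-in, on the assumption that the HPX overload without an execution policy produces the same output. The helper name symmetric_difference_sketch is hypothetical and not part of HPX.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper (not part of HPX): symmetric difference of two sorted
// vectors, computed with the standard-library counterpart of the algorithm.
std::vector<int> symmetric_difference_sketch(
    std::vector<int> const& a, std::vector<int> const& b)
{
    // Both inputs must already be sorted with the same ordering (operator<).
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));

    std::vector<int> out;
    // Writing through back_inserter into a fresh vector guarantees that the
    // output range does not overlap either input range.
    std::set_symmetric_difference(
        a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out));
    return out;
}
```

For the inputs {1, 2, 2, 2, 5} and {2, 3, 5, 5}, the value 2 occurs three times and once, so two copies survive. The value 5 occurs once and twice, so one copy survives. The result is {1, 2, 2, 3, 5}.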

hpx::set_union#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename FwdIter3, typename Pred = hpx::parallel::detail::less>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter3> set_union(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, FwdIter3 dest, Pred &&op = Pred())#

Constructs a sorted range beginning at dest consisting of all elements present in one or both sorted ranges [first1, last1) and [first2, last2). This algorithm expects both input ranges to be sorted with the given binary predicate op. Executed according to the policy.

If some element is found m times in [first1, last1) and n times in [first2, last2), then all m elements will be copied from [first1, last1) to dest, preserving order, and then exactly std::max(n-m, 0) elements will be copied from [first2, last2) to dest, also preserving order.

The resulting range cannot overlap with either of the input ranges.

Function objects applied by a parallel algorithm invoked with a sequential execution policy object execute in sequential order in the calling thread (sequenced_policy) or in a single new thread spawned from the current thread (sequenced_task_policy).

Function objects applied by a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most 2*(N1 + N2 - 1) comparisons, where N1 is the length of the first sequence and N2 is the length of the second sequence.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • FwdIter1 – The type of the source iterators used (deduced) representing the first sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used (deduced) representing the second sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter3 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator, or of an output iterator when the algorithm executes sequentially.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of set_union requires Pred to meet the requirements of CopyConstructible. This defaults to hpx::parallel::detail::less.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – The binary predicate which returns true if the first argument should be ordered before the second (the default is a less-than comparison). The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);

    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1.

Returns

The set_union algorithm returns a hpx::future<FwdIter3> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter3 otherwise. The set_union algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename FwdIter1, typename FwdIter2, typename FwdIter3, typename Pred = hpx::parallel::detail::less>
FwdIter3 set_union(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter2 last2, FwdIter3 dest, Pred &&op = Pred())#

Constructs a sorted range beginning at dest consisting of all elements present in one or both sorted ranges [first1, last1) and [first2, last2). This algorithm expects both input ranges to be sorted with the given binary predicate op.

If some element is found m times in [first1, last1) and n times in [first2, last2), then all m elements will be copied from [first1, last1) to dest, preserving order, and then exactly std::max(n-m, 0) elements will be copied from [first2, last2) to dest, also preserving order.

The resulting range cannot overlap with either of the input ranges.

Note

Complexity: At most 2*(N1 + N2 - 1) comparisons, where N1 is the length of the first sequence and N2 is the length of the second sequence.

Template Parameters
  • FwdIter1 – The type of the source iterators used (deduced) representing the first sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators used (deduced) representing the second sequence. This iterator type must meet the requirements of a forward iterator.

  • FwdIter3 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator, or of an output iterator when the algorithm executes sequentially.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of set_union requires Pred to meet the requirements of CopyConstructible. This defaults to hpx::parallel::detail::less.

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – The binary predicate which returns true if the first argument should be ordered before the second (the default is a less-than comparison). The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);

    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1.

Returns

The set_union algorithm returns a FwdIter3. The set_union algorithm returns the output iterator to the element in the destination range, one past the last element copied.
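
The duplicate-handling rule described above (all m copies from the first range, then std::max(n-m, 0) further copies from the second) can be illustrated with a short sketch. It calls std::set_union from the C++ standard library, on the assumption that the HPX overload without an execution policy yields the same output. The helper name union_sketch is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper (not part of HPX): union of two sorted vectors,
// computed with the standard-library counterpart of the algorithm.
std::vector<int> union_sketch(
    std::vector<int> const& a, std::vector<int> const& b)
{
    // Both inputs must already be sorted with the same ordering (operator<).
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));

    std::vector<int> out;
    out.reserve(a.size() + b.size());    // upper bound on the output size
    // The destination is a separate container, so it cannot overlap the inputs.
    std::set_union(
        a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out));
    return out;
}
```

With {1, 2, 2, 5} and {2, 2, 2, 3}, the value 2 occurs twice in the first range and three times in the second, so the output holds three copies of it: {1, 2, 2, 2, 3, 5}.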

hpx::shift_left#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter, typename Size>
FwdIter shift_left(FwdIter first, FwdIter last, Size n)#

Shifts the elements in the range [first, last) by n positions towards the beginning of the range. For every integer i in [0, last - first - n), moves the element originally at position first + n + i to position first + i.

The assignment operations in the parallel shift_left algorithm invoked without an execution policy object will execute in sequential order in the calling thread.

Note

Complexity: At most (last - first) - n assignments.

Note

The type of dereferenced FwdIter must meet the requirements of MoveAssignable.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of positions to shift by.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • n – Refers to the number of positions to shift.

Returns

The shift_left algorithm returns FwdIter. The shift_left algorithm returns an iterator to the end of the resulting range.

template<typename ExPolicy, typename FwdIter, typename Size>
hpx::parallel::util::detail::algorithm_result<ExPolicy, FwdIter> shift_left(ExPolicy &&policy, FwdIter first, FwdIter last, Size n)#

Shifts the elements in the range [first, last) by n positions towards the beginning of the range. For every integer i in [0, last - first - n), moves the element originally at position first + n + i to position first + i. Executed according to the policy.

The assignment operations in the parallel shift_left algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignment operations in the parallel shift_left algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most (last - first) - n assignments.

Note

The type of dereferenced FwdIter must meet the requirements of MoveAssignable.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of positions to shift by.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • n – Refers to the number of positions to shift.

Returns

The shift_left algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The shift_left algorithm returns an iterator to the end of the resulting range.
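
The element movement described above can be sketched sequentially with the three-argument std::move from the standard `<algorithm>` header. This is an illustration of the semantics only, not the HPX implementation. The helper name shifted_left_sketch is hypothetical, and it returns just the shifted elements rather than an iterator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of HPX): returns the elements that remain in
// the resulting range after shifting v left by n positions.
std::vector<int> shifted_left_sketch(std::vector<int> v, std::size_t n)
{
    if (n == 0)
        return v;     // nothing to shift
    if (n >= v.size())
        return {};    // no element can be shifted into the range

    std::size_t const old_size = v.size();
    // Moves the element at position n + i to position i for every i in
    // [0, size - n). Moving front to back is safe because the destination
    // starts before the source.
    auto new_end = std::move(
        v.begin() + static_cast<std::ptrdiff_t>(n), v.end(), v.begin());
    // Elements in [new_end, end) are left in a moved-from state; drop them.
    v.erase(new_end, v.end());
    assert(v.size() == old_size - n);
    return v;
}
```

Shifting {1, 2, 3, 4, 5} left by 2 leaves {3, 4, 5}, which takes (last - first) - n = 3 assignments, matching the stated complexity bound.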

hpx::shift_right#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter, typename Size>
FwdIter shift_right(FwdIter first, FwdIter last, Size n)#

Shifts the elements in the range [first, last) by n positions towards the end of the range. For every integer i in [0, last - first - n), moves the element originally at position first + i to position first + n + i.

The assignment operations in the parallel shift_right algorithm invoked without an execution policy object will execute in sequential order in the calling thread.

Note

Complexity: At most (last - first) - n assignments.

Note

The type of dereferenced FwdIter must meet the requirements of MoveAssignable.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of positions to shift by.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • n – Refers to the number of positions to shift.

Returns

The shift_right algorithm returns FwdIter. The shift_right algorithm returns an iterator to the end of the resulting range.

template<typename ExPolicy, typename FwdIter, typename Size>
hpx::parallel::util::detail::algorithm_result<ExPolicy, FwdIter> shift_right(ExPolicy &&policy, FwdIter first, FwdIter last, Size n)#

Shifts the elements in the range [first, last) by n positions towards the end of the range. For every integer i in [0, last - first - n), moves the element originally at position first + i to position first + n + i. Executed according to the policy.

The assignment operations in the parallel shift_right algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignment operations in the parallel shift_right algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most (last - first) - n assignments.

Note

The type of dereferenced FwdIter must meet the requirements of MoveAssignable.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of positions to shift by.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • n – Refers to the number of positions to shift.

Returns

The shift_right algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The shift_right algorithm returns an iterator to the end of the resulting range.
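
A sequential sketch of the same movement, using std::move_backward from the standard library, may help make the index arithmetic concrete. It only illustrates the semantics and is not the HPX implementation. The helper name shifted_right_sketch is hypothetical, and it returns just the shifted elements rather than an iterator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of HPX): returns the elements that were
// shifted right by n positions, i.e. those now stored at positions [n, size).
std::vector<int> shifted_right_sketch(std::vector<int> v, std::size_t n)
{
    if (n == 0)
        return v;     // nothing to shift
    if (n >= v.size())
        return {};    // no element can be shifted into the range

    std::size_t const old_size = v.size();
    // Moves the element at position i to position n + i for every i in
    // [0, size - n). Moving back to front ensures no element is overwritten
    // before it has been moved.
    auto new_begin = std::move_backward(
        v.begin(), v.end() - static_cast<std::ptrdiff_t>(n), v.end());
    // Elements in [begin, new_begin) are left in a moved-from state; drop them.
    v.erase(v.begin(), new_begin);
    assert(v.size() == old_size - n);
    return v;
}
```

Shifting {1, 2, 3, 4, 5} right by 2 moves 1, 2, 3 into the last three positions, so the sketch returns {1, 2, 3}.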

hpx::sort#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename RandomIt, typename Comp = hpx::parallel::detail::less, typename Proj = hpx::identity>
void sort(RandomIt first, RandomIt last, Comp &&comp, Proj &&proj = Proj())#

Sorts the elements in the range [first, last) in ascending order. The order of equal elements is not guaranteed to be preserved. The function uses the given comparison function object comp (defaults to using operator<()).

A sequence is sorted with respect to a comparator comp and a projection proj if, for every iterator i pointing to the sequence and every non-negative integer n such that i + n is a valid iterator pointing to an element of the sequence, INVOKE(comp, INVOKE(proj, *(i + n)), INVOKE(proj, *i)) == false.

comp has to induce a strict weak ordering on the values.

The assignments in the parallel sort algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: O(N log(N)) comparisons, where N = std::distance(first, last).

Template Parameters
  • RandomIt – The type of the source iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • comp – comp is a callable object. The return value of the INVOKE operation applied to an object of type Comp, when contextually converted to bool, yields true if the first argument of the call is less than the second, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator.

  • proj – Specifies the function (or function object) which will be invoked for each pair of elements as a projection operation before the actual predicate comp is invoked.

Returns

The sort algorithm returns void.

template<typename ExPolicy, typename RandomIt, typename Comp = hpx::parallel::detail::less, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy> sort(ExPolicy &&policy, RandomIt first, RandomIt last, Comp &&comp, Proj &&proj)#

Sorts the elements in the range [first, last) in ascending order. The order of equal elements is not guaranteed to be preserved. The function uses the given comparison function object comp (defaults to using operator<()). Executed according to the policy.

A sequence is sorted with respect to a comparator comp and a projection proj if, for every iterator i pointing to the sequence and every non-negative integer n such that i + n is a valid iterator pointing to an element of the sequence, INVOKE(comp, INVOKE(proj, *(i + n)), INVOKE(proj, *i)) == false.

comp has to induce a strict weak ordering on the values.

Function objects applied by a parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Function objects applied by a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: O(N log(N)) comparisons, where N = std::distance(first, last).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • RandomIt – The type of the source iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • comp – comp is a callable object. The return value of the INVOKE operation applied to an object of type Comp, when contextually converted to bool, yields true if the first argument of the call is less than the second, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator.

  • proj – Specifies the function (or function object) which will be invoked for each pair of elements as a projection operation before the actual predicate comp is invoked.

Returns

The sort algorithm returns a hpx::future<void> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns void otherwise.
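
The interplay of comp and proj described above can be sketched with std::sort by folding the projection into the comparator, so that each comparison evaluates comp(proj(a), proj(b)). This is a sequential illustration under that assumption, not the HPX implementation. The helper name sorted_by_sketch is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper (not part of HPX): sorts a copy of v so that the
// projected values are ordered by comp.
template <typename T, typename Comp, typename Proj>
std::vector<T> sorted_by_sketch(std::vector<T> v, Comp comp, Proj proj)
{
    // Apply the projection to both arguments before calling the comparator.
    auto projected_less = [&](T const& a, T const& b) {
        return comp(proj(a), proj(b));
    };
    std::sort(v.begin(), v.end(), projected_less);
    // Post-condition: no element is ordered before its predecessor.
    assert(std::is_sorted(v.begin(), v.end(), projected_less));
    return v;
}
```

For example, sorting {3, -1, 2} with std::less<int> and a projection that squares each value orders the elements by their squares 1, 4, 9, which gives {-1, 2, 3}.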

hpx::experimental::sort_by_key#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace experimental

Top-level namespace.

Functions

template<typename ExPolicy, typename KeyIter, typename ValueIter, typename Compare = detail::less>
util::detail::algorithm_result_t<ExPolicy, sort_by_key_result<KeyIter, ValueIter>> sort_by_key(ExPolicy &&policy, KeyIter key_first, KeyIter key_last, ValueIter value_first, Compare &&comp = Compare())#

Sorts one range of data using keys supplied in another range. The key elements in the range [key_first, key_last) are sorted in ascending order with the corresponding elements in the value range moved to follow the sorted order. The algorithm is not stable: the order of equal elements is not guaranteed to be preserved. The function uses the given comparison function object comp (defaults to using operator<()). Executed according to the policy.

A sequence is sorted with respect to a comparator comp if, for every iterator i pointing to the sequence and every non-negative integer n such that i + n is a valid iterator pointing to an element of the sequence, INVOKE(comp, *(i + n), *i) == false.

comp has to induce a strict weak ordering on the values.

Function objects applied by a parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Function objects applied by a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: O(N log(N)) comparisons, where N = std::distance(key_first, key_last).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • KeyIter – The type of the key iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • ValueIter – The type of the value iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Compare – The type of the function/function object to use (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • key_first – Refers to the beginning of the sequence of key elements the algorithm will be applied to.

  • key_last – Refers to the end of the sequence of key elements the algorithm will be applied to.

  • value_first – Refers to the beginning of the sequence of value elements the algorithm will be applied to, the range of elements must match [key_first, key_last)

  • comp – comp is a callable object. The return value of the INVOKE operation applied to an object of type Compare, when contextually converted to bool, yields true if the first argument of the call is less than the second, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator.

Returns

The sort_by_key algorithm returns a hpx::future<sort_by_key_result<KeyIter, ValueIter>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns sort_by_key_result<KeyIter, ValueIter> otherwise. The algorithm returns a pair holding an iterator pointing to the first element after the last element in the input key sequence and an iterator pointing to the first element after the last element in the input value sequence.
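
The effect on the value range can be sketched sequentially by sorting an index permutation on the keys and then gathering the values in that order. This is a swapped-in technique chosen for clarity, not a description of how HPX implements sort_by_key. The helper name values_sorted_by_key_sketch is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper (not part of HPX): returns the values rearranged so
// that they follow the ascending order of their corresponding keys.
std::vector<std::string> values_sorted_by_key_sketch(
    std::vector<int> const& keys, std::vector<std::string> const& values)
{
    // The value range must have the same length as the key range.
    assert(keys.size() == values.size());

    // order[i] is the original index of the element that ends up at slot i.
    std::vector<std::size_t> order(keys.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    // Like the documented algorithm, this sort is not stable for equal keys.
    std::sort(order.begin(), order.end(),
        [&](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });

    std::vector<std::string> out;
    out.reserve(values.size());
    for (std::size_t i : order)
        out.push_back(values[i]);
    return out;
}
```

With keys {3, 1, 2} and values {"c", "a", "b"}, the keys sort to {1, 2, 3} and the values follow as {"a", "b", "c"}.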

hpx::stable_sort#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename RandomIt, typename Comp = hpx::parallel::detail::less, typename Proj = hpx::identity>
void stable_sort(RandomIt first, RandomIt last, Comp &&comp = Comp(), Proj &&proj = Proj())#

Sorts the elements in the range [first, last) in ascending order. The relative order of equal elements is preserved. The function uses the given comparison function object comp (defaults to using operator<()).

A sequence is sorted with respect to a comparator comp and a projection proj if, for every iterator i pointing to the sequence and every non-negative integer n such that i + n is a valid iterator pointing to an element of the sequence, INVOKE(comp, INVOKE(proj, *(i + n)), INVOKE(proj, *i)) == false.

comp has to induce a strict weak ordering on the values.

The assignments in the parallel stable_sort algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: O(N log(N)) comparisons, where N = std::distance(first, last).

Template Parameters
  • RandomIt – The type of the source iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • comp – comp is a callable object. The return value of the INVOKE operation applied to an object of type Comp, when contextually converted to bool, yields true if the first argument of the call is less than the second, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator.

  • proj – Specifies the function (or function object) which will be invoked for each pair of elements as a projection operation before the actual predicate comp is invoked.

Returns

The stable_sort algorithm returns void.

template<typename ExPolicy, typename RandomIt, typename Comp = hpx::parallel::detail::less, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy> stable_sort(ExPolicy &&policy, RandomIt first, RandomIt last, Comp &&comp = Comp(), Proj &&proj = Proj())#

Sorts the elements in the range [first, last) in ascending order. The relative order of equal elements is preserved. The function uses the given comparison function object comp (defaults to using operator<()). Executed according to the policy.

A sequence is sorted with respect to a comparator comp and a projection proj if, for every iterator i pointing to the sequence and every non-negative integer n such that i + n is a valid iterator pointing to an element of the sequence, INVOKE(comp, INVOKE(proj, *(i + n)), INVOKE(proj, *i)) == false.

comp has to induce a strict weak ordering on the values.

Function objects applied by a parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Function objects applied by a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: O(N log(N)) comparisons, where N = std::distance(first, last).

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • RandomIt – The type of the source iterators used (deduced). This iterator type must meet the requirements of a random access iterator.

  • Comp – The type of the function/function object to use (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • comp – comp is a callable object. The return value of the INVOKE operation applied to an object of type Comp, when contextually converted to bool, yields true if the first argument of the call is less than the second, and false otherwise. It is assumed that comp will not apply any non-constant function through the dereferenced iterator.

  • proj – Specifies the function (or function object) which will be invoked for each pair of elements as a projection operation before the actual predicate comp is invoked.

Returns

The stable_sort algorithm returns a hpx::future<void> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns void otherwise.
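
The stability guarantee is easiest to see on records that compare equal on the sort key but differ elsewhere. The sketch below uses std::stable_sort from the standard library as a stand-in, on the assumption that the HPX overload without an execution policy gives the same ordering. The names record and stable_sorted_by_key_sketch are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical record type: first is the sort key, second is a tag that
// reveals the original relative order of equal keys.
using record = std::pair<int, char>;

// Hypothetical helper (not part of HPX): sorts by key only, keeping records
// with equal keys in their original relative order.
std::vector<record> stable_sorted_by_key_sketch(std::vector<record> v)
{
    auto key_less = [](record const& a, record const& b) {
        return a.first < b.first;    // the tag is deliberately ignored
    };
    std::stable_sort(v.begin(), v.end(), key_less);
    assert(std::is_sorted(v.begin(), v.end(), key_less));
    return v;
}
```

Given {2,'a'}, {1,'b'}, {2,'c'}, {1,'d'}, the result is {1,'b'}, {1,'d'}, {2,'a'}, {2,'c'}: within each key the tags keep their input order, which a non-stable sort would not guarantee.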

hpx::starts_with#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter1, typename InIter2, typename Pred = hpx::parallel::detail::equal_to, typename Proj1 = hpx::identity, typename Proj2 = hpx::identity>
bool starts_with(InIter1 first1, InIter1 last1, InIter2 first2, InIter2 last2, Pred &&pred = Pred(), Proj1 &&proj1 = Proj1(), Proj2 &&proj2 = Proj2())#

Checks whether the second range, defined by [first2, last2), matches the prefix of the first range, defined by [first1, last1).

The comparison operations in the parallel starts_with algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: Linear: at most min(N1, N2) applications of the predicate and both projections, where N1 is the length of the first range and N2 is the length of the second range.

Template Parameters
  • InIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • InIter2 – The type of the destination iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • Pred – The binary predicate that compares the projected elements. This defaults to hpx::parallel::detail::equal_to.

  • Proj1 – The type of an optional projection function for the source range. This defaults to hpx::identity.

  • Proj2 – The type of an optional projection function for the destination range. This defaults to hpx::identity.

Parameters
  • first1 – Refers to the beginning of the source range.

  • last1 – Sentinel value referring to the end of the source range.

  • first2 – Refers to the beginning of the destination range.

  • last2 – Sentinel value referring to the end of the destination range.

  • pred – Specifies the binary predicate function (or function object) which will be invoked for comparison of the elements in the two ranges projected by proj1 and proj2 respectively.

  • proj1 – Specifies the function (or function object) which will be invoked for each of the elements in the source range as a projection operation before the actual predicate pred is invoked.

  • proj2 – Specifies the function (or function object) which will be invoked for each of the elements in the destination range as a projection operation before the actual predicate pred is invoked.

Returns

The starts_with algorithm returns bool. The starts_with algorithm returns a boolean with the value true if the second range matches the prefix of the first range, false otherwise.

template<typename ExPolicy, typename InIter1, typename InIter2, typename Pred = hpx::parallel::detail::equal_to, typename Proj1 = hpx::identity, typename Proj2 = hpx::identity>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool> starts_with(ExPolicy &&policy, InIter1 first1, InIter1 last1, InIter2 first2, InIter2 last2, Pred &&pred = Pred(), Proj1 &&proj1 = Proj1(), Proj2 &&proj2 = Proj2())#

Checks whether the second range, defined by [first2, last2), matches the prefix of the first range, defined by [first1, last1). Executed according to the policy.

The comparison operations in the parallel starts_with algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Note

Complexity: Linear: at most min(N1, N2) applications of the predicate and both projections, where N1 is the length of the first range and N2 is the length of the second range.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • InIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • InIter2 – The type of the destination iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • Pred – The binary predicate that compares the projected elements. This defaults to hpx::parallel::detail::equal_to.

  • Proj1 – The type of an optional projection function for the source range. This defaults to hpx::identity.

  • Proj2 – The type of an optional projection function for the destination range. This defaults to hpx::identity.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the source range.

  • last1 – Sentinel value referring to the end of the source range.

  • first2 – Refers to the beginning of the destination range.

  • last2 – Sentinel value referring to the end of the destination range.

  • pred – Specifies the binary predicate function (or function object) which will be invoked for comparison of the elements in the two ranges projected by proj1 and proj2 respectively.

  • proj1 – Specifies the function (or function object) which will be invoked for each of the elements in the source range as a projection operation before the actual predicate pred is invoked.

  • proj2 – Specifies the function (or function object) which will be invoked for each of the elements in the destination range as a projection operation before the actual predicate pred is invoked.

Returns

The starts_with algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The starts_with algorithm returns a boolean with the value true if the second range matches the prefix of the first range, false otherwise.
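As a rough illustration of the prefix check described above, the following sketch models the semantics sequentially with std::mismatch from the C++ standard library. It is not the HPX implementation: starts_with_sketch is a hypothetical helper, and the projections proj1 and proj2 are left out for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper (not part of HPX): sequential model of the documented
// starts_with semantics. Returns true if [first2, last2) matches the prefix
// of [first1, last1) under pred.
template <typename It1, typename It2, typename Pred = std::equal_to<>>
bool starts_with_sketch(It1 first1, It1 last1, It2 first2, It2 last2,
    Pred pred = Pred())
{
    // std::mismatch stops at the first differing pair or at the end of
    // either range, so it applies pred at most min(N1, N2) times.
    auto ends = std::mismatch(first1, last1, first2, last2, pred);

    // The prefix matches only if the whole second range was consumed.
    return ends.second == last2;
}
```

Assuming hpx::starts_with behaves as documented above, a call such as hpx::starts_with(hpx::execution::par, v.begin(), v.end(), p.begin(), p.end()) would be expected to give the same answer as the helper for the same inputs.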

hpx::swap_ranges#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter1, typename FwdIter2>
FwdIter2 swap_ranges(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2)#

Exchanges elements between range [first1, last1) and another range starting at first2.

The swap operations in the parallel swap_ranges algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: Linear in the distance between first1 and last1.

Template Parameters
  • FwdIter1 – The type of the first range of iterators to swap (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the second range of iterators to swap (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first1 – Refers to the beginning of the first sequence of elements the algorithm will be applied to.

  • last1 – Refers to the end of the first sequence of elements the algorithm will be applied to.

  • first2 – Refers to the beginning of the second sequence of elements the algorithm will be applied to.

Returns

The swap_ranges algorithm returns FwdIter2. The swap_ranges algorithm returns an iterator to the element past the last element exchanged in the range beginning with first2.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> swap_ranges(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2)#

Exchanges elements between range [first1, last1) and another range starting at first2. Executed according to the policy.

The swap operations in the parallel swap_ranges algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The swap operations in the parallel swap_ranges algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Linear in the distance between first1 and last1.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the swap operations.

  • FwdIter1 – The type of the first range of iterators to swap (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the second range of iterators to swap (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the first sequence of elements the algorithm will be applied to.

  • last1 – Refers to the end of the first sequence of elements the algorithm will be applied to.

  • first2 – Refers to the beginning of the second sequence of elements the algorithm will be applied to.

Returns

The swap_ranges algorithm returns a hpx::future<FwdIter2> if the execution policy is of type parallel_task_policy and returns FwdIter2 otherwise. The swap_ranges algorithm returns an iterator to the element past the last element exchanged in the range beginning with first2.
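A minimal sketch of the exchange and of the returned iterator, written against std::swap_ranges from the C++ standard library so that it runs without an HPX installation. The helper name swap_into is hypothetical, and the sketch assumes the second range is at least as long as the first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: exchanges every element of `a` with the leading
// elements of `b` and reports how far into `b` the exchange reached.
// Precondition assumed by this sketch: b.size() >= a.size().
std::size_t swap_into(std::vector<int>& a, std::vector<int>& b)
{
    // The returned iterator points one past the last element exchanged
    // in the range beginning at b.begin().
    auto past_last = std::swap_ranges(a.begin(), a.end(), b.begin());
    return static_cast<std::size_t>(past_last - b.begin());
}
```

With a = {1, 2, 3} and b = {10, 20, 30, 40}, the helper leaves a holding {10, 20, 30} and b holding {1, 2, 3, 40}, and returns 3. Assuming hpx::swap_ranges follows the semantics documented above, replacing std::swap_ranges with hpx::swap_ranges (optionally passing an execution policy first) should produce the same result.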

hpx::transform#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter1, typename FwdIter2, typename F>
FwdIter2 transform(FwdIter1 first, FwdIter1 last, FwdIter2 dest, F &&f)#

Applies the given function f to the range [first, last) and stores the result in another range, beginning at dest.

Note

Complexity: Exactly last - first applications of f

Template Parameters
  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of transform requires F to meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary function. The signature of this function should be equivalent to:

    Ret fun(const Type &a);
    
    The signature does not need to have const&. The type Type must be such that an object of type FwdIter1 can be dereferenced and then implicitly converted to Type. The type Ret must be such that an object of type FwdIter2 can be dereferenced and assigned a value of type Ret.

Returns

The transform algorithm returns a FwdIter2. The transform algorithm returns the output iterator to the element in the destination range, one past the last element transformed.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename F>
parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> transform(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, F &&f)#

Applies the given function f to the range [first, last) and stores the result in another range, beginning at dest. Executed according to the policy.

The invocations of f in the parallel transform algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The invocations of f in the parallel transform algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Exactly last - first applications of f

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the invocations of f.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of transform requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary function. The signature of this function should be equivalent to:

    Ret fun(const Type &a);
    
    The signature does not need to have const&. The type Type must be such that an object of type FwdIter1 can be dereferenced and then implicitly converted to Type. The type Ret must be such that an object of type FwdIter2 can be dereferenced and assigned a value of type Ret.

Returns

The transform algorithm returns a hpx::future<FwdIter2> if the execution policy is of type parallel_task_policy and returns FwdIter2 otherwise. The transform algorithm returns the output iterator to the element in the destination range, one past the last element transformed.
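The unary form can be sketched as follows. The example uses std::transform from the C++ standard library as a stand-in, on the assumption that hpx::transform follows the same element-wise semantics; the helper name squares is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: writes f(x) = x * x for every x in `in` to a new
// vector of the same length.
std::vector<int> squares(const std::vector<int>& in)
{
    std::vector<int> out(in.size());

    // f is applied exactly in.size() times, once per input element.
    auto past_last = std::transform(
        in.begin(), in.end(), out.begin(), [](int x) { return x * x; });

    // The returned iterator is one past the last element written.
    assert(past_last == out.end());
    return out;
}
```

For the input {1, 2, 3} the helper returns {1, 4, 9}. Under the stated assumption, the HPX call would take the same arguments, with an execution policy such as hpx::execution::par optionally passed first.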

template<typename FwdIter1, typename FwdIter2, typename FwdIter3, typename F>
FwdIter3 transform(FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter3 dest, F &&f)#

Applies the given function f to pairs of elements from two ranges: one defined by [first1, last1) and the other beginning at first2, and stores the result in another range, beginning at dest.

Note

Complexity: Exactly last1 - first1 applications of f

Template Parameters
  • FwdIter1 – The type of the source iterators for the first range used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators for the second range used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter3 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of transform requires F to meet the requirements of CopyConstructible.

Parameters
  • first1 – Refers to the beginning of the first sequence of elements the algorithm will be applied to.

  • last1 – Refers to the end of the first sequence of elements the algorithm will be applied to.

  • first2 – Refers to the beginning of the second sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • f – Specifies the function (or function object) which will be invoked for each pair of elements from the sequence specified by [first1, last1) and the sequence beginning at first2. This is a binary function. The signature of this function should be equivalent to:

    Ret fun(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively. The type Ret must be such that an object of type FwdIter3 can be dereferenced and assigned a value of type Ret.

Returns

The transform algorithm returns a FwdIter3. The transform algorithm returns the output iterator to the element in the destination range, one past the last element transformed.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename FwdIter3, typename F>
parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter3> transform(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, FwdIter3 dest, F &&f)#

Applies the given function f to pairs of elements from two ranges: one defined by [first1, last1) and the other beginning at first2, and stores the result in another range, beginning at dest. Executed according to the policy.

The invocations of f in the parallel transform algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The invocations of f in the parallel transform algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Exactly last1 - first1 applications of f

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the invocations of f.

  • FwdIter1 – The type of the source iterators for the first range used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the source iterators for the second range used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter3 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of transform requires F to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the first sequence of elements the algorithm will be applied to.

  • last1 – Refers to the end of the first sequence of elements the algorithm will be applied to.

  • first2 – Refers to the beginning of the second sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • f – Specifies the function (or function object) which will be invoked for each pair of elements from the sequence specified by [first1, last1) and the sequence beginning at first2. This is a binary function. The signature of this function should be equivalent to:

    Ret fun(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&. The types Type1 and Type2 must be such that objects of types FwdIter1 and FwdIter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively. The type Ret must be such that an object of type FwdIter3 can be dereferenced and assigned a value of type Ret.

Returns

The transform algorithm returns a hpx::future<FwdIter3> if the execution policy is of type parallel_task_policy and returns FwdIter3 otherwise. The transform algorithm returns the output iterator to the element in the destination range, one past the last element transformed.
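The binary form pairs up elements from two input ranges. The sketch below again uses std::transform from the C++ standard library as a stand-in, assuming hpx::transform pairs elements in the same way; elementwise_sum is a hypothetical helper, and the second input is assumed to be at least as long as the first.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: adds a[i] and b[i] for every index i of `a`.
// Precondition assumed by this sketch: b.size() >= a.size().
std::vector<int> elementwise_sum(
    const std::vector<int>& a, const std::vector<int>& b)
{
    std::vector<int> out(a.size());

    // Only the first range has an explicit end; the second range and the
    // destination are given by their first iterators alone.
    std::transform(a.begin(), a.end(), b.begin(), out.begin(),
        [](int x, int y) { return x + y; });
    return out;
}
```

For a = {1, 2, 3} and b = {10, 20, 30} the helper returns {11, 22, 33}.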

hpx::transform_exclusive_scan#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter, typename OutIter, typename BinOp, typename UnOp, typename T = typename std::iterator_traits<InIter>::value_type>
OutIter transform_exclusive_scan(InIter first, InIter last, OutIter dest, T init, BinOp &&binary_op, UnOp &&unary_op)#

Transforms each element in the range [first, last) with unary_op, then computes an exclusive prefix sum operation using binary_op over the resulting range, with init as the initial value, and writes the results to the range beginning at dest. “Exclusive” means that the i-th input element is not included in the i-th sum. Formally, assigns through each iterator i in [dest, dest + (last - first)) the value of the generalized noncommutative sum of init, unary_op(*j)… for every j in [first, first + (i - dest)) over binary_op, where the generalized noncommutative sum GNSUM(op, a1, …, aN) is defined as follows:

  • if N = 1, a1

  • if N > 1, op(GNSUM(op, a1, …, aK), GNSUM(op, aM, …, aN)) for any K where 1 < K+1 = M <= N

In other words, the summation operations may be performed in arbitrary order, and the behavior is nondeterministic if binary_op is not associative.

The reduce operations in the parallel transform_exclusive_scan algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Neither unary_op nor binary_op shall invalidate iterators or sub-ranges, or modify elements in the ranges [first, last) or [dest, dest + (last - first)).

The behavior of transform_exclusive_scan may be non-deterministic if binary_op is not associative.

Note

Complexity: O(last - first) applications of each of binary_op and unary_op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.
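To see why associativity matters, the definition can be evaluated by hand for N = 3, where the only choices are K = 1 and K = 2. The sketch below computes both groupings; both_groupings is a hypothetical helper written for this illustration and is not part of HPX.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper: evaluates GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, a2, a3)
// for both admissible split points and returns the two results.
template <typename Op>
std::pair<int, int> both_groupings(Op op, int a1, int a2, int a3)
{
    int split_after_2 = op(op(a1, a2), a3);    // K = 2, M = 3
    int split_after_1 = op(a1, op(a2, a3));    // K = 1, M = 2
    return {split_after_2, split_after_1};
}
```

With std::plus<>, which is associative, both groupings of 1, 2, 3 give 6. With std::minus<>, which is not, the groupings give (1 - 2) - 3 = -4 and 1 - (2 - 3) = 2, so an algorithm that is free to pick either grouping has no single well-defined result.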

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • BinOp – The type of binary_op.

  • UnOp – The type of unary_op.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • init – The initial value for the generalized sum.

  • binary_op – Binary FunctionObject that will be applied to the result of unary_op, the results of other binary_op, and init.

  • unary_op – Unary FunctionObject that will be applied to each element of the input range. The return type must be acceptable as input to binary_op.

Returns

The transform_exclusive_scan algorithm returns OutIter. The transform_exclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename BinOp, typename UnOp, typename T = typename std::iterator_traits<FwdIter1>::value_type>
parallel::util::detail::algorithm_result<ExPolicy, FwdIter2>::type transform_exclusive_scan(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, T init, BinOp &&binary_op, UnOp &&unary_op)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(binary_op, init, unary_op(*first), …, unary_op(*(first + (i - dest) - 1))). Executed according to the policy.

The reduce operations in the parallel transform_exclusive_scan algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel transform_exclusive_scan algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Neither unary_op nor binary_op shall invalidate iterators or sub-ranges, or modify elements in the ranges [first, last) or [dest, dest + (last - first)).

The behavior of transform_exclusive_scan may be non-deterministic if binary_op is not associative.

Note

Complexity: O(last - first) applications of each of binary_op and unary_op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • BinOp – The type of binary_op.

  • UnOp – The type of unary_op.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • init – The initial value for the generalized sum.

  • binary_op – Binary FunctionObject that will be applied to the result of unary_op, the results of other binary_op, and init.

  • unary_op – Unary FunctionObject that will be applied to each element of the input range. The return type must be acceptable as input to binary_op.

Returns

The transform_exclusive_scan algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The transform_exclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.
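A short worked example of the exclusive scan, using std::transform_exclusive_scan from the C++17 standard library, whose argument order (dest, init, binary_op, unary_op) matches the signatures shown above. It assumes hpx::transform_exclusive_scan produces the same values for an associative binary_op; the helper name exclusive_sum_of_squares is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: squares each input element (unary_op) and writes the
// running sum of the squares seen *before* each position (binary_op = plus),
// starting from init.
std::vector<int> exclusive_sum_of_squares(const std::vector<int>& in, int init)
{
    std::vector<int> out(in.size());
    std::transform_exclusive_scan(in.begin(), in.end(), out.begin(), init,
        std::plus<>{}, [](int x) { return x * x; });
    return out;
}
```

For the input {1, 2, 3, 4} the squares are 1, 4, 9, 16. With init = 0 the result is {0, 1, 5, 14}: each output excludes the square at its own position, and the first output is init itself. With init = 10 every value shifts by 10, giving {10, 11, 15, 24}.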

hpx::transform_inclusive_scan#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter, typename OutIter, typename BinOp, typename UnOp>
OutIter transform_inclusive_scan(InIter first, InIter last, OutIter dest, BinOp &&binary_op, UnOp &&unary_op)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(binary_op, unary_op(*first), …, unary_op(*(first + (i - dest)))).

The reduce operations in the parallel transform_inclusive_scan algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Neither binary_op nor unary_op shall invalidate iterators or sub-ranges, or modify elements in the ranges [first, last) or [dest, dest + (last - first)).

The difference between transform_exclusive_scan and transform_inclusive_scan is that transform_inclusive_scan includes the ith input element in the ith sum.

Note

Complexity: O(last - first) applications of each of binary_op and unary_op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • BinOp – The type of binary_op.

  • UnOp – The type of unary_op.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • binary_op – Binary FunctionObject that will be applied to the result of unary_op, the results of other binary_op, and init if provided.

  • unary_op – Unary FunctionObject that will be applied to each element of the input range. The return type must be acceptable as input to binary_op.

Returns

The transform_inclusive_scan algorithm returns OutIter. The transform_inclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename BinOp, typename UnOp>
parallel::util::detail::algorithm_result<ExPolicy, FwdIter2>::type transform_inclusive_scan(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, BinOp &&binary_op, UnOp &&unary_op)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(binary_op, unary_op(*first), …, unary_op(*(first + (i - dest)))). Executed according to the policy.

The reduce operations in the parallel transform_inclusive_scan algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel transform_inclusive_scan algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Neither binary_op nor unary_op shall invalidate iterators or sub-ranges, or modify elements in the ranges [first, last) or [dest, dest + (last - first)).

The difference between transform_exclusive_scan and transform_inclusive_scan is that transform_inclusive_scan includes the ith input element in the ith sum.

Note

Complexity: O(last - first) applications of each of binary_op and unary_op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • BinOp – The type of binary_op.

  • UnOp – The type of unary_op.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • binary_op – Binary FunctionObject that will be applied to the result of unary_op, the results of other binary_op, and init if provided.

  • unary_op – Unary FunctionObject that will be applied to each element of the input range. The return type must be acceptable as input to binary_op.

Returns

The transform_inclusive_scan algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The transform_inclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.
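A short worked example of the inclusive scan without an initial value, using std::transform_inclusive_scan from the C++17 standard library as a stand-in. It assumes hpx::transform_inclusive_scan produces the same values for an associative binary_op; the helper name inclusive_sum_of_squares is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: squares each input element (unary_op) and writes the
// running sum of the squares up to and including each position
// (binary_op = plus).
std::vector<int> inclusive_sum_of_squares(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    std::transform_inclusive_scan(in.begin(), in.end(), out.begin(),
        std::plus<>{}, [](int x) { return x * x; });
    return out;
}
```

For the input {1, 2, 3, 4} the squares are 1, 4, 9, 16 and the result is {1, 5, 14, 30}. Each output includes the square at its own position, which is what distinguishes it from the exclusive variant.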

template<typename InIter, typename OutIter, typename BinOp, typename UnOp, typename T = typename std::iterator_traits<InIter>::value_type>
OutIter transform_inclusive_scan(InIter first, InIter last, OutIter dest, BinOp &&binary_op, UnOp &&unary_op, T init)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(binary_op, init, unary_op(*first), …, unary_op(*(first + (i - dest)))).

The reduce operations in the parallel transform_inclusive_scan algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Neither binary_op nor unary_op shall invalidate iterators or sub-ranges, or modify elements in the ranges [first, last) or [dest, dest + (last - first)).

The difference between transform_exclusive_scan and transform_inclusive_scan is that transform_inclusive_scan includes the ith input element in the ith sum. If binary_op is not mathematically associative, the behavior of transform_inclusive_scan may be non-deterministic.

Note

Complexity: O(last - first) applications of each of binary_op and unary_op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • BinOp – The type of binary_op.

  • UnOp – The type of unary_op.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • binary_op – Binary FunctionObject that will be applied to the result of unary_op, the results of other binary_op, and init if provided.

  • unary_op – Unary FunctionObject that will be applied to each element of the input range. The return type must be acceptable as input to binary_op.

  • init – The initial value for the generalized sum.

Returns

The transform_inclusive_scan algorithm returns OutIter. The transform_inclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename BinOp, typename UnOp, typename T = typename std::iterator_traits<FwdIter1>::value_type>
parallel::util::detail::algorithm_result<ExPolicy, FwdIter2>::type transform_inclusive_scan(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, BinOp &&binary_op, UnOp &&unary_op, T init)#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(binary_op, init, unary_op(*first), …, unary_op(*(first + (i - dest)))). Executed according to the policy.

The reduce operations in the parallel transform_inclusive_scan algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel transform_inclusive_scan algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Neither binary_op nor unary_op shall invalidate iterators or sub-ranges, or modify elements in the ranges [first,last) or [result,result + (last - first)).

The difference between transform_exclusive_scan and transform_inclusive_scan is that transform_inclusive_scan includes the ith input element in the ith sum. If binary_op is not mathematically associative, the behavior of transform_inclusive_scan may be non-deterministic.

Note

Complexity: O(last - first) applications of each of binary_op and unary_op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.
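The definition above fixes the order of the operands but leaves the grouping open. The following sketch illustrates that with two of the groupings allowed for four operands. The helper names (sum_left_nested, sum_balanced, concat, subtract, grouping_demo) are hypothetical and not part of HPX; the code only uses standard C++.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers (not part of HPX): two of the groupings that
// GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, a2, a3, a4) permits.

// Split at K = 3, then K = 2, then K = 1: ((a1 op a2) op a3) op a4.
template <typename Op, typename T>
T sum_left_nested(Op op, T const& a1, T const& a2, T const& a3, T const& a4)
{
    return op(op(op(a1, a2), a3), a4);
}

// Split at K = 2: (a1 op a2) op (a3 op a4).
template <typename Op, typename T>
T sum_balanced(Op op, T const& a1, T const& a2, T const& a3, T const& a4)
{
    return op(op(a1, a2), op(a3, a4));
}

inline std::string concat(std::string const& l, std::string const& r)
{
    return l + r;    // associative, but not commutative
}

inline int subtract(int l, int r)
{
    return l - r;    // neither associative nor commutative
}

// Usage: the grouping only matters for the non-associative operation.
inline void grouping_demo()
{
    std::string const a = "a", b = "b", c = "c", d = "d";
    assert(sum_left_nested(concat, a, b, c, d) ==
        sum_balanced(concat, a, b, c, d));
    assert(sum_left_nested(subtract, 10, 1, 2, 3) !=
        sum_balanced(subtract, 10, 1, 2, 3));
}
```

String concatenation yields the same result for both groupings because it is associative, even though it is not commutative. Subtraction is not associative, so the two groupings disagree, which is why a non-associative binary_op can make a parallel scan non-deterministic.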

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • BinOp – The type of binary_op.

  • UnOp – The type of unary_op.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • binary_op – Binary FunctionObject that will be applied to the results of unary_op, the results of other binary_op invocations, and init if provided.

  • unary_op – Unary FunctionObject that will be applied to each element of the input range. The return type must be acceptable as input to binary_op.

  • init – The initial value for the generalized sum.

Returns

The transform_inclusive_scan algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The transform_inclusive_scan algorithm returns the output iterator to the element in the destination range, one past the last element copied.
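As a minimal sketch of the argument order (binary_op, then unary_op, then init), the following uses the sequential standard-library counterpart std::transform_inclusive_scan, on the assumption that the HPX overloads mirror it. The function names running_sum_of_squares and scan_demo are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: running sum of squares, seeded with init.
// Uses std::transform_inclusive_scan as a stand-in for the HPX algorithm.
inline std::vector<int> running_sum_of_squares(
    std::vector<int> const& in, int init)
{
    std::vector<int> out(in.size());
    std::transform_inclusive_scan(in.begin(), in.end(), out.begin(),
        std::plus<>{},                  // binary_op: combines partial sums
        [](int x) { return x * x; },    // unary_op: applied to each element
        init);                          // init: seeds the first sum
    return out;
}

// Usage: with init = 10 the sums are 10+1, 11+4, 15+9.
inline void scan_demo()
{
    assert((running_sum_of_squares({1, 2, 3}, 10) ==
        std::vector<int>{11, 15, 24}));
}
```

Each output element includes the transformed input element at the same position, which is what makes the scan inclusive.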

hpx::transform_reduce#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename ExPolicy, typename FwdIter, typename T, typename Reduce, typename Convert>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, T> transform_reduce(ExPolicy &&policy, FwdIter first, FwdIter last, T init, Reduce &&red_op, Convert &&conv_op)#

Returns GENERALIZED_SUM(red_op, init, conv_op(*first), …, conv_op(*(first + (last - first) - 1))). Executed according to the policy.

The reduce operations in the parallel transform_reduce algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel transform_reduce algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between transform_reduce and accumulate is that the behavior of transform_reduce may be non-deterministic for a non-associative or non-commutative binary predicate.

Note

Complexity: O(last - first) applications of the predicates red_op and conv_op.

Note

GENERALIZED_SUM(op, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • op(GENERALIZED_SUM(op, b1, …, bK), GENERALIZED_SUM(op, bM, …, bN)), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

  • Reduce – The type of the binary function object used for the reduction operation.

  • Convert – The type of the unary function object used to transform the elements of the input sequence before invoking the reduce function.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • init – The initial value for the generalized sum.

  • red_op – Specifies the function (or function object) which will be invoked for each of the values returned from the invocation of conv_op. This is a binary predicate. The signature of this predicate should be equivalent to:

    Ret fun(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1, Type2, and Ret must be such that an object of a type as returned from conv_op can be implicitly converted to any of those types.

  • conv_op – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate. The signature of this predicate should be equivalent to:

    R fun(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter can be dereferenced and then implicitly converted to Type. The type R must be such that an object of this type can be implicitly converted to T.

Returns

The transform_reduce algorithm returns a hpx::future<T> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns T otherwise. The transform_reduce algorithm returns the result of the generalized sum over the values returned from conv_op when applied to the elements given by the input range [first, last).

template<typename InIter, typename T, typename Reduce, typename Convert>
T transform_reduce(InIter first, InIter last, T init, Reduce &&red_op, Convert &&conv_op)#

Returns GENERALIZED_SUM(red_op, init, conv_op(*first), …, conv_op(*(first + (last - first) - 1))).

The difference between transform_reduce and accumulate is that the behavior of transform_reduce may be non-deterministic for a non-associative or non-commutative binary predicate.

Note

Complexity: O(last - first) applications of the predicates red_op and conv_op.

Note

GENERALIZED_SUM(op, a1, …, aN) is defined as follows:

  • a1 when N is 1

  • op(GENERALIZED_SUM(op, b1, …, bK), GENERALIZED_SUM(op, bM, …, bN)), where:

    • b1, …, bN may be any permutation of a1, …, aN and

    • 1 < K+1 = M <= N.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

  • Reduce – The type of the binary function object used for the reduction operation.

  • Convert – The type of the unary function object used to transform the elements of the input sequence before invoking the reduce function.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • init – The initial value for the generalized sum.

  • red_op – Specifies the function (or function object) which will be invoked for each of the values returned from the invocation of conv_op. This is a binary predicate. The signature of this predicate should be equivalent to:

    Ret fun(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1, Type2, and Ret must be such that an object of a type as returned from conv_op can be implicitly converted to any of those types.

  • conv_op – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate. The signature of this predicate should be equivalent to:

    R fun(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type InIter can be dereferenced and then implicitly converted to Type. The type R must be such that an object of this type can be implicitly converted to T.

Returns

The transform_reduce algorithm returns a T. The transform_reduce algorithm returns the result of the generalized sum over the values returned from conv_op when applied to the elements given by the input range [first, last).
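A minimal sketch of the single-range form, assuming the HPX overload takes its arguments in the same order as the standard std::transform_reduce used here as a stand-in: init first, then the reduction, then the conversion. The names total_length and total_length_demo are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: sums the lengths of all words.
// conv_op maps each std::string to its size; red_op adds the sizes.
inline std::size_t total_length(std::vector<std::string> const& words)
{
    return std::transform_reduce(words.begin(), words.end(),
        std::size_t{0},                                     // init
        std::plus<>{},                                      // red_op
        [](std::string const& w) { return w.size(); });     // conv_op
}

// Usage: 3 + 2 + 8 characters.
inline void total_length_demo()
{
    assert(total_length({"hpx", "is", "parallel"}) == 13);
}
```

Addition of unsigned integers is associative and commutative, so the result does not depend on how the implementation groups or reorders the operands.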

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename T>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, T> transform_reduce(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, T init)#

Returns the result of accumulating init with the inner products of the pairs formed by the elements of two ranges starting at first1 and first2. Executed according to the policy.

The operations in the parallel transform_reduce algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The operations in the parallel transform_reduce algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: O(last1 - first1) applications each of reduce and transform.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the first source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the second source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be used as the initial and return value (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the first sequence of elements the result will be calculated with.

  • last1 – Refers to the end of the first sequence of elements the algorithm will be applied to.

  • first2 – Refers to the beginning of the second sequence of elements the result will be calculated with.

  • init – The initial value for the sum.

Returns

The transform_reduce algorithm returns a hpx::future<T> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns T otherwise.

template<typename InIter1, typename InIter2, typename T>
T transform_reduce(InIter1 first1, InIter1 last1, InIter2 first2, T init)#

Returns the result of accumulating init with the inner products of the pairs formed by the elements of two ranges starting at first1 and first2.

Note

Complexity: O(last1 - first1) applications each of reduce and transform.

Template Parameters
  • InIter1 – The type of the first source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • InIter2 – The type of the second source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • T – The type of the value to be used as the initial and return value (deduced).

Parameters
  • first1 – Refers to the beginning of the first sequence of elements the result will be calculated with.

  • last1 – Refers to the end of the first sequence of elements the algorithm will be applied to.

  • first2 – Refers to the beginning of the second sequence of elements the result will be calculated with.

  • init – The initial value for the sum.

Returns

The transform_reduce algorithm returns a T.
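The two-range form without custom operations computes an inner product. A minimal sketch, using the standard std::transform_reduce as a stand-in and assuming the HPX overload behaves the same way; the function name dot is hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: inner product of a and b, accumulated onto 0.0.
// Only first2 is passed for the second range, so b must provide at
// least as many elements as a.
inline double dot(std::vector<double> const& a, std::vector<double> const& b)
{
    assert(a.size() <= b.size());
    return std::transform_reduce(a.begin(), a.end(), b.begin(), 0.0);
}
```

For example, dot({1, 2, 3}, {4, 5, 6}) evaluates 1*4 + 2*5 + 3*6. The type of init determines the type of the result, so passing 0 instead of 0.0 would accumulate in int.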

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename T, typename Reduce, typename Convert>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, T> transform_reduce(ExPolicy &&policy, FwdIter1 first1, FwdIter1 last1, FwdIter2 first2, T init, Reduce &&red_op, Convert &&conv_op)#

Returns the result of accumulating init with the inner products of the pairs formed by the elements of two ranges starting at first1 and first2. Executed according to the policy.

The operations in the parallel transform_reduce algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The operations in the parallel transform_reduce algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: O(last1 - first1) applications each of reduce and transform.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the first source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the second source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be used as the initial and return value (deduced).

  • Reduce – The type of the binary function object used for the reduction operation.

  • Convert – The type of the binary function object used to combine each pair of elements from the two input sequences before invoking the reduce function.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the first sequence of elements the result will be calculated with.

  • last1 – Refers to the end of the first sequence of elements the algorithm will be applied to.

  • first2 – Refers to the beginning of the second sequence of elements the result will be calculated with.

  • init – The initial value for the sum.

  • red_op – Specifies the function (or function object) which will be invoked for the initial value and each of the return values of conv_op. This is a binary predicate. The signature of this predicate should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Ret must be such that it can be implicitly converted to a type of T.

  • conv_op – Specifies the function (or function object) which will be invoked for each pair of input values taken from the two sequences. This is a binary predicate. The signature of this predicate should be equivalent to:

    Ret fun(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Ret must be such that it can be implicitly converted to an object for the second argument type of red_op.

Returns

The transform_reduce algorithm returns a hpx::future<T> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns T otherwise.

template<typename InIter1, typename InIter2, typename T, typename Reduce, typename Convert>
T transform_reduce(InIter1 first1, InIter1 last1, InIter2 first2, T init, Reduce &&red_op, Convert &&conv_op)#

Returns the result of accumulating init with the inner products of the pairs formed by the elements of two ranges starting at first1 and first2.

Note

Complexity: O(last1 - first1) applications each of reduce and transform.

Template Parameters
  • InIter1 – The type of the first source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • InIter2 – The type of the second source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • T – The type of the value to be used as the initial and return value (deduced).

  • Reduce – The type of the binary function object used for the reduction operation.

  • Convert – The type of the binary function object used to combine each pair of elements from the two input sequences before invoking the reduce function.

Parameters
  • first1 – Refers to the beginning of the first sequence of elements the result will be calculated with.

  • last1 – Refers to the end of the first sequence of elements the algorithm will be applied to.

  • first2 – Refers to the beginning of the second sequence of elements the result will be calculated with.

  • init – The initial value for the sum.

  • red_op – Specifies the function (or function object) which will be invoked for the initial value and each of the return values of conv_op. This is a binary predicate. The signature of this predicate should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Ret must be such that it can be implicitly converted to a type of T.

  • conv_op – Specifies the function (or function object) which will be invoked for each pair of input values taken from the two sequences. This is a binary predicate. The signature of this predicate should be equivalent to:

    Ret fun(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Ret must be such that it can be implicitly converted to an object for the second argument type of red_op.

Returns

The transform_reduce algorithm returns a T.
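With custom operations, conv_op receives one element from each range and red_op folds the converted values into init. The sketch below uses the standard std::transform_reduce as a stand-in, assuming the HPX overload takes red_op before conv_op in the same way; the function name max_abs_difference is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <numeric>
#include <vector>

// Hypothetical helper: largest element-wise absolute difference.
// conv_op turns each pair (x, y) into |x - y|; red_op keeps the maximum.
inline int max_abs_difference(
    std::vector<int> const& a, std::vector<int> const& b)
{
    assert(a.size() <= b.size());
    return std::transform_reduce(a.begin(), a.end(), b.begin(),
        0,                                              // init
        [](int l, int r) { return std::max(l, r); },    // red_op
        [](int x, int y) { return std::abs(x - y); });  // conv_op
}
```

Taking the maximum is associative and commutative, so this choice of red_op keeps the result deterministic regardless of evaluation order. The initial value 0 is safe here because absolute differences are never negative.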

hpx/parallel/algorithms/transform_reduce_binary.hpp#

Defined in header hpx/parallel/algorithms/transform_reduce_binary.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

hpx::uninitialized_copy, hpx::uninitialized_copy_n#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter, typename FwdIter>
FwdIter uninitialized_copy(InIter first, InIter last, FwdIter dest)#

Copies the elements in the range, defined by [first, last), to an uninitialized memory area beginning at dest. If an exception is thrown during the copy operation, the function has no effects.

The assignments in the parallel uninitialized_copy algorithm invoked without an execution policy object will execute in sequential order in the calling thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The uninitialized_copy algorithm returns FwdIter. The uninitialized_copy algorithm returns the output iterator to the element in the destination range, one past the last element copied.
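The following sketch shows the typical life cycle around such a call: allocate raw storage, copy-construct into it, use the objects, destroy them, and release the storage. It uses the standard std::uninitialized_copy as a stand-in, assuming the HPX overload behaves alike; the function name copy_via_raw_storage is hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical helper: round-trips src through uninitialized storage.
// Kept short for illustration: if the vector construction throws, the
// raw storage is leaked.
inline std::vector<std::string> copy_via_raw_storage(
    std::vector<std::string> const& src)
{
    std::allocator<std::string> alloc;
    std::string* raw = alloc.allocate(src.size());   // no objects yet

    // Copy-constructs one std::string per source element in raw storage.
    std::string* end = std::uninitialized_copy(src.begin(), src.end(), raw);
    assert(end == raw + src.size());   // one past the last element copied

    std::vector<std::string> result(raw, end);

    std::destroy(raw, end);            // end lifetimes before releasing
    alloc.deallocate(raw, src.size());
    return result;
}
```

Plain std::copy would be incorrect here because it assigns to existing objects, and the destination does not contain any objects before the call.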

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> uninitialized_copy(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest)#

Copies the elements in the range, defined by [first, last), to an uninitialized memory area beginning at dest. If an exception is thrown during the copy operation, the function has no effects. Executed according to the policy.

The assignments in the parallel uninitialized_copy algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel uninitialized_copy algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The uninitialized_copy algorithm returns a hpx::future<FwdIter2>, if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The uninitialized_copy algorithm returns the output iterator to the element in the destination range, one past the last element copied.

template<typename InIter, typename Size, typename FwdIter>
FwdIter uninitialized_copy_n(InIter first, Size count, FwdIter dest)#

Copies the elements in the range [first, first + count), starting from first and proceeding to first + count - 1, to an uninitialized memory area beginning at dest. If an exception is thrown during the copy operation, the function has no effects.

The assignments in the parallel uninitialized_copy_n algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: Performs exactly count assignments, if count > 0, no assignments otherwise.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • Size – The type of the argument specifying the number of elements to copy.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The uninitialized_copy_n algorithm returns FwdIter. The returned iterator refers to the element in the destination range one past the last element copied.

template<typename ExPolicy, typename FwdIter1, typename Size, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> uninitialized_copy_n(ExPolicy &&policy, FwdIter1 first, Size count, FwdIter2 dest)#

Copies the elements in the range [first, first + count), starting from first and proceeding to first + count - 1, to an uninitialized memory area beginning at dest. If an exception is thrown during the copy operation, the function has no effects. Executed according to the policy.

The assignments in the parallel uninitialized_copy_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel uninitialized_copy_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly count assignments, if count > 0, no assignments otherwise.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to copy.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The uninitialized_copy_n algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The uninitialized_copy_n algorithm returns the output iterator to the element in the destination range, one past the last element copied.
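A minimal sketch of the count-based form, again using the sequential standard counterpart std::uninitialized_copy_n as a stand-in and assuming the HPX overloads behave alike. The function name join_first_n is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical helper: copies the first count strings into raw storage
// and concatenates the copies.
inline std::string join_first_n(
    std::vector<std::string> const& src, std::size_t count)
{
    assert(count <= src.size());

    std::allocator<std::string> alloc;
    std::string* raw = alloc.allocate(count);

    // The returned pointer is one past the last element copied.
    std::string* end = std::uninitialized_copy_n(src.begin(), count, raw);

    std::string joined;
    for (std::string* p = raw; p != end; ++p)
        joined += *p;

    std::destroy(raw, end);
    alloc.deallocate(raw, count);
    return joined;
}
```

The returned iterator is what allows the caller to walk or destroy exactly the objects that were created, without recomputing raw + count.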

hpx::uninitialized_default_construct, hpx::uninitialized_default_construct_n#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter>
void uninitialized_default_construct(FwdIter first, FwdIter last)#

Constructs objects of type typename std::iterator_traits<FwdIter>::value_type in the uninitialized storage designated by the range [first, last) by default-initialization. If an exception is thrown during the initialization, the function has no effects.

The assignments in the parallel uninitialized_default_construct algorithm invoked without an execution policy object will execute in sequential order in the calling thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The uninitialized_default_construct algorithm returns nothing

template<typename ExPolicy, typename FwdIter>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy> uninitialized_default_construct(ExPolicy &&policy, FwdIter first, FwdIter last)#

Constructs objects of type typename std::iterator_traits<FwdIter>::value_type in the uninitialized storage designated by the range [first, last) by default-initialization. If an exception is thrown during the initialization, the function has no effects. Executed according to the policy.

The assignments in the parallel uninitialized_default_construct algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel uninitialized_default_construct algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The uninitialized_default_construct algorithm returns a hpx::future<void>, if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns nothing otherwise.
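The sketch below default-initializes objects in raw storage with the standard std::uninitialized_default_construct, used as a stand-in on the assumption that the HPX overloads behave alike. The type Counter and the functions sum_default_constructed and default_construct_demo are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical type: its default member initializer runs during
// default-initialization.
struct Counter
{
    int value = 7;
};

// Hypothetical helper: default-constructs n Counters in raw storage
// and sums their values.
inline int sum_default_constructed(std::size_t n)
{
    std::allocator<Counter> alloc;
    Counter* raw = alloc.allocate(n);

    std::uninitialized_default_construct(raw, raw + n);

    int sum = 0;
    for (std::size_t i = 0; i != n; ++i)
        sum += raw[i].value;

    std::destroy(raw, raw + n);
    alloc.deallocate(raw, n);
    return sum;
}

// Usage: three Counters, each holding 7.
inline void default_construct_demo()
{
    assert(sum_default_constructed(3) == 21);
}
```

Default-initialization leaves objects of fundamental types such as int with indeterminate values, so reading them before assigning is an error. A class type with a default member initializer, as used here, avoids that.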

template<typename FwdIter, typename Size>
FwdIter uninitialized_default_construct_n(FwdIter first, Size count)#

Constructs objects of type typename std::iterator_traits<FwdIter>::value_type in the uninitialized storage designated by the range [first, first + count) by default-initialization. If an exception is thrown during the initialization, the function has no effects.

Note

Complexity: Performs exactly count assignments, if count > 0, no assignments otherwise.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to construct.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

Returns

The uninitialized_default_construct_n algorithm returns FwdIter. The returned iterator refers to the element in the source range one past the last element constructed.

template<typename ExPolicy, typename FwdIter, typename Size>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> uninitialized_default_construct_n(ExPolicy &&policy, FwdIter first, Size count)#

Constructs objects of type typename std::iterator_traits<FwdIter>::value_type in the uninitialized storage designated by the range [first, first + count) by default-initialization. If an exception is thrown during the initialization, the function has no effects. Executed according to the policy.

The assignments in the parallel uninitialized_default_construct_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel uninitialized_default_construct_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly count assignments, if count > 0, no assignments otherwise.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to construct.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

Returns

The uninitialized_default_construct_n algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The uninitialized_default_construct_n algorithm returns the iterator to the element in the source range, one past the last element constructed.
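A minimal sketch of how the returned iterator can be used, based on the sequential standard counterpart std::uninitialized_default_construct_n and assuming the HPX overloads behave alike. The function name default_construct_strings is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical helper: default-constructs n strings in raw storage and
// reports how many objects the returned iterator spans.
inline std::size_t default_construct_strings(std::size_t n)
{
    std::allocator<std::string> alloc;
    std::string* raw = alloc.allocate(n);

    // end is one past the last element constructed.
    std::string* end = std::uninitialized_default_construct_n(raw, n);
    assert(end == raw + n);

    std::size_t const constructed = static_cast<std::size_t>(end - raw);

    std::destroy(raw, end);     // destroy exactly what was constructed
    alloc.deallocate(raw, n);
    return constructed;
}
```

Passing the returned iterator to std::destroy keeps construction and destruction symmetric.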

hpx::uninitialized_fill, hpx::uninitialized_fill_n#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter, typename T>
void uninitialized_fill(FwdIter first, FwdIter last, T const &value)#

Copies the given value to an uninitialized memory area, defined by the range [first, last). If an exception is thrown during the initialization, the function has no effects.

Note

Complexity: Linear in the distance between first and last

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be assigned (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The uninitialized_fill algorithm returns nothing.
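A minimal sketch of filling raw storage follows. It calls std::uninitialized_fill as a stand-in so that it compiles without HPX; replacing the call with hpx::uninitialized_fill after including hpx/algorithm.hpp is assumed to give the same result. The helper name fill_raw_storage is hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Copy-constructs `n` strings equal to `value` in raw storage, copies them
// into a vector for inspection, then destroys them and frees the storage.
std::vector<std::string> fill_raw_storage(std::size_t n, std::string const& value)
{
    std::allocator<std::string> alloc;
    std::string* first = alloc.allocate(n);    // raw memory, no objects yet
    std::string* last = first + n;

    // Stand-in for hpx::uninitialized_fill(first, last, value).
    std::uninitialized_fill(first, last, value);

    std::vector<std::string> result(first, last);

    std::destroy(first, last);    // the caller owns the constructed objects
    alloc.deallocate(first, n);
    return result;
}
```

Because the algorithm starts object lifetimes in memory it does not own, the caller remains responsible for destroying the elements before releasing the storage.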

template<typename ExPolicy, typename FwdIter, typename T>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy> uninitialized_fill(ExPolicy &&policy, FwdIter first, FwdIter last, T const &value)#

Copies the given value to an uninitialized memory area, defined by the range [first, last). If an exception is thrown during the initialization, the function has no effects. Executed according to the policy.

The initializations in the parallel uninitialized_fill algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The initializations in the parallel uninitialized_fill algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Linear in the distance between first and last

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be assigned (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The uninitialized_fill algorithm returns a hpx::future<void>, if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns nothing otherwise.

template<typename FwdIter, typename Size, typename T>
FwdIter uninitialized_fill_n(FwdIter first, Size count, T const &value)#

Copies the given value to the first count elements in an uninitialized memory area beginning at first. If an exception is thrown during the initialization, the function has no effects.

Note

Complexity: Performs exactly count assignments, if count > 0, no assignments otherwise.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to fill.

  • T – The type of the value to be assigned (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The uninitialized_fill_n algorithm returns FwdIter. The returned iterator refers to the element in the range one past the last element copied.
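The meaning of the returned iterator can be sketched as follows, again with std::uninitialized_fill_n standing in for the HPX function (assumed to return the same position). filled_prefix_length is a hypothetical helper; it expects count to be no larger than capacity.

```cpp
#include <cstddef>
#include <memory>

// Constructs `count` ints equal to `value` at the start of a raw buffer that
// has room for `capacity` ints, and reports how far the returned iterator
// advanced from `first`. Precondition: count <= capacity.
std::ptrdiff_t filled_prefix_length(std::size_t capacity, std::size_t count, int value)
{
    std::allocator<int> alloc;
    int* first = alloc.allocate(capacity);

    // Stand-in for hpx::uninitialized_fill_n(first, count, value).
    int* end_of_filled = std::uninitialized_fill_n(first, count, value);

    std::ptrdiff_t advanced = end_of_filled - first;

    std::destroy(first, end_of_filled);    // only the prefix holds objects
    alloc.deallocate(first, capacity);
    return advanced;
}
```

Only the range [first, returned iterator) contains live objects; the rest of the buffer is still raw storage.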

template<typename ExPolicy, typename FwdIter, typename Size, typename T>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> uninitialized_fill_n(ExPolicy &&policy, FwdIter first, Size count, T const &value)#

Copies the given value to the first count elements in an uninitialized memory area beginning at first. If an exception is thrown during the initialization, the function has no effects. Executed according to the policy.

The initializations in the parallel uninitialized_fill_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The initializations in the parallel uninitialized_fill_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly count assignments, if count > 0, no assignments otherwise.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to fill.

  • T – The type of the value to be assigned (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The uninitialized_fill_n algorithm returns a hpx::future<FwdIter>, if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The uninitialized_fill_n algorithm returns the output iterator to the element in the range, one past the last element copied.

hpx::uninitialized_move, hpx::uninitialized_move_n#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter, typename FwdIter>
FwdIter uninitialized_move(InIter first, InIter last, FwdIter dest)#

Moves the elements in the range, defined by [first, last), to an uninitialized memory area beginning at dest. If an exception is thrown during the initialization, some objects in [first, last) are left in a valid but unspecified state.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The uninitialized_move algorithm returns FwdIter. The uninitialized_move algorithm returns the output iterator to the element in the destination range, one past the last element moved.
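As a sketch of the non-policy overload, the following moves the contents of a vector into raw storage. std::uninitialized_move is used as a stand-in so the block builds without HPX; the hpx:: function is assumed to be a drop-in replacement here. move_into_raw_storage is a hypothetical name.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Move-constructs every element of `source` into freshly allocated raw
// storage and returns a copy of the destination contents.
std::vector<std::string> move_into_raw_storage(std::vector<std::string>& source)
{
    std::size_t const n = source.size();
    std::allocator<std::string> alloc;
    std::string* dest = alloc.allocate(n);    // raw memory, no objects yet

    // Stand-in for hpx::uninitialized_move(first, last, dest).
    std::string* dest_end =
        std::uninitialized_move(source.begin(), source.end(), dest);

    std::vector<std::string> result(dest, dest_end);

    std::destroy(dest, dest_end);
    alloc.deallocate(dest, n);
    return result;
}
```

The source elements still exist after the call, in a valid but unspecified state, so the source container keeps its size and must still be destroyed as usual.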

template<typename ExPolicy, typename FwdIter1, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> uninitialized_move(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest)#

Moves the elements in the range, defined by [first, last), to an uninitialized memory area beginning at dest. If an exception is thrown during the initialization, some objects in [first, last) are left in a valid but unspecified state. Executed according to the policy.

The assignments in the parallel uninitialized_move algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel uninitialized_move algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The uninitialized_move algorithm returns a hpx::future<FwdIter2>, if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The uninitialized_move algorithm returns the output iterator to the element in the destination range, one past the last element moved.

template<typename InIter, typename Size, typename FwdIter>
std::pair<InIter, FwdIter> uninitialized_move_n(InIter first, Size count, FwdIter dest)#

Moves the elements in the range [first, first + count), starting from first and proceeding to first + count - 1, to an uninitialized memory area beginning at dest. If an exception is thrown during the initialization, some objects in [first, first + count) are left in a valid but unspecified state.

Note

Complexity: Performs exactly count movements, if count > 0, no move operations otherwise.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • Size – The type of the argument specifying the number of elements to move.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The uninitialized_move_n algorithm returns std::pair<InIter,FwdIter>: a pair whose first element is an iterator to the element past the last element moved in the source range, and whose second element is an iterator to the element past the last element moved in the destination range.
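The two iterators in the returned pair can be inspected with a small sketch. std::uninitialized_move_n stands in for the HPX function, on the assumption that both return the same pair of positions; move_prefix is a hypothetical helper that expects count to be between 1 and source.size().

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Moves the first `count` strings of `source` into raw storage and reports
// how far the source and destination iterators of the result advanced.
std::pair<std::ptrdiff_t, std::ptrdiff_t> move_prefix(
    std::vector<std::string>& source, std::size_t count)
{
    std::allocator<std::string> alloc;
    std::string* dest = alloc.allocate(count);

    // Stand-in for hpx::uninitialized_move_n(first, count, dest).
    auto ends = std::uninitialized_move_n(source.begin(), count, dest);

    std::pair<std::ptrdiff_t, std::ptrdiff_t> advanced{
        ends.first - source.begin(), ends.second - dest};

    std::destroy(dest, ends.second);
    alloc.deallocate(dest, count);
    return advanced;
}
```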

template<typename ExPolicy, typename FwdIter1, typename Size, typename FwdIter2>
parallel::util::detail::algorithm_result<ExPolicy, std::pair<FwdIter1, FwdIter2>>::type uninitialized_move_n(ExPolicy &&policy, FwdIter1 first, Size count, FwdIter2 dest)#

Moves the elements in the range [first, first + count), starting from first and proceeding to first + count - 1, to an uninitialized memory area beginning at dest. If an exception is thrown during the initialization, some objects in [first, first + count) are left in a valid but unspecified state. Executed according to the policy.

The assignments in the parallel uninitialized_move_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel uninitialized_move_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly count movements, if count > 0, no move operations otherwise.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to move.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The uninitialized_move_n algorithm returns a hpx::future<std::pair<FwdIter1,FwdIter2>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns std::pair<FwdIter1,FwdIter2> otherwise. The result is a pair whose first element is an iterator to the element past the last element moved in the source range, and whose second element is an iterator to the element past the last element moved in the destination range.

hpx::uninitialized_relocate, hpx::uninitialized_relocate_n#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename InIter1, typename InIter2, typename FwdIter>
FwdIter uninitialized_relocate(InIter1 first, InIter2 last, FwdIter dest)#

Relocates the elements in the range, defined by [first, last), to an uninitialized memory area beginning at dest. If an exception is thrown during the move-construction of an element, all elements left in the input range, as well as all objects already constructed in the destination range are destroyed. After this algorithm completes, the source range should be freed or reused without destroying the objects.

The assignments in the parallel uninitialized_relocate algorithm invoked without an execution policy object will execute in sequential order in the calling thread.

Note

Complexity: time: O(n), space: O(1).

  • For “trivially relocatable” underlying types (T) and a contiguous iterator range [first, last): std::distance(first, last)*sizeof(T) bytes are copied.

  • For “trivially relocatable” underlying types (T) and a non-contiguous iterator range [first, last): std::distance(first, last) memory copies of sizeof(T) bytes each are performed.

  • For “non-trivially relocatable” underlying types (T): std::distance(first, last) move constructions and destructions are performed.

Note

Declare a type as “trivially relocatable” using the HPX_DECLARE_TRIVIALLY_RELOCATABLE macros found in <hpx/type_support/is_trivially_relocatable.hpp>.

Template Parameters
  • InIter1 – The type of the source iterator first (deduced). This iterator type must meet the requirements of an input iterator.

  • InIter2 – The type of the source iterator last (deduced). This iterator type must meet the requirements of an input iterator.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The uninitialized_relocate algorithm returns FwdIter. The uninitialized_relocate algorithm returns the output iterator to the element in the destination range, one past the last element relocated.
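The relocation semantics described above (move-construct into the destination, then destroy the source) can be sketched by hand. This is not the HPX implementation: relocate_sketch and grow_and_read_back are hypothetical helpers, the sketch omits the exception handling described above, and it ignores the byte-copy fast path for trivially relocatable types.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Hypothetical helper, not part of HPX: relocates [first, last) into raw
// storage at dest by move-constructing each element and then destroying its
// source. Assumes the move constructor does not throw.
template <typename T>
T* relocate_sketch(T* first, T* last, T* dest)
{
    for (; first != last; ++first, ++dest)
    {
        ::new (static_cast<void*>(dest)) T(std::move(*first));
        std::destroy_at(first);    // the source object's lifetime ends here
    }
    return dest;    // one past the last element relocated
}

// Grows a two-element raw buffer into a four-element one, the way a
// container reallocation might, and returns the element at `index` (0 or 1).
std::string grow_and_read_back(std::size_t index)
{
    std::allocator<std::string> alloc;
    std::string* old_buf = alloc.allocate(2);
    ::new (static_cast<void*>(old_buf)) std::string("first");
    ::new (static_cast<void*>(old_buf + 1)) std::string("second");

    std::string* new_buf = alloc.allocate(4);
    std::string* new_end = relocate_sketch(old_buf, old_buf + 2, new_buf);

    // The source objects are already destroyed: free the storage only.
    alloc.deallocate(old_buf, 2);

    std::string value = new_buf[index];
    std::destroy(new_buf, new_end);
    alloc.deallocate(new_buf, 4);
    return value;
}
```

The key difference from uninitialized_move is visible in the cleanup: after relocation the old buffer is deallocated without calling any destructors.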

template<typename ExPolicy, typename InIter1, typename InIter2, typename FwdIter>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> uninitialized_relocate(ExPolicy &&policy, InIter1 first, InIter2 last, FwdIter dest)#

Relocates the elements in the range defined by [first, last), to an uninitialized memory area beginning at dest. If an exception is thrown during the move-construction of an element, all elements left in the input range, as well as all objects already constructed in the destination range are destroyed. After this algorithm completes, the source range should be freed or reused without destroying the objects.

The assignments in the parallel uninitialized_relocate algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: time: O(n), space: O(1).

  • For “trivially relocatable” underlying types (T) and a contiguous iterator range [first, last): std::distance(first, last)*sizeof(T) bytes are copied.

  • For “trivially relocatable” underlying types (T) and a non-contiguous iterator range [first, last): std::distance(first, last) memory copies of sizeof(T) bytes each are performed.

  • For “non-trivially relocatable” underlying types (T): std::distance(first, last) move constructions and destructions are performed.

Note

Declare a type as “trivially relocatable” using the HPX_DECLARE_TRIVIALLY_RELOCATABLE macros found in <hpx/type_support/is_trivially_relocatable.hpp>.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • InIter1 – The type of the source iterator first (deduced). This iterator type must meet the requirements of an input iterator.

  • InIter2 – The type of the source iterator last (deduced). This iterator type must meet the requirements of an input iterator.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

The assignments in the parallel uninitialized_relocate algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Returns

The uninitialized_relocate algorithm returns a hpx::future<FwdIter>, if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The uninitialized_relocate algorithm returns the output iterator to the element in the destination range, one past the last element relocated.

template<typename BiIter1, typename BiIter2>
BiIter2 uninitialized_relocate_backward(BiIter1 first, BiIter1 last, BiIter2 dest_last)#

Relocates the elements in the range, defined by [first, last), to an uninitialized memory area ending at dest_last. The objects are processed in reverse order. If an exception is thrown during the move-construction of an element, all elements left in the input range, as well as all objects already constructed in the destination range are destroyed. After this algorithm completes, the source range should be freed or reused without destroying the objects.

The assignments in the parallel uninitialized_relocate_backward algorithm invoked without an execution policy object will execute in sequential order in the calling thread.

Note

Complexity: time: O(n), space: O(1).

  • For “trivially relocatable” underlying types (T) and a contiguous iterator range [first, last): std::distance(first, last)*sizeof(T) bytes are copied.

  • For “trivially relocatable” underlying types (T) and a non-contiguous iterator range [first, last): std::distance(first, last) memory copies of sizeof(T) bytes each are performed.

  • For “non-trivially relocatable” underlying types (T): std::distance(first, last) move constructions and destructions are performed.

Note

Declare a type as “trivially relocatable” using the HPX_DECLARE_TRIVIALLY_RELOCATABLE macros found in <hpx/type_support/is_trivially_relocatable.hpp>.

Template Parameters
  • BiIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a bidirectional iterator.

  • BiIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a bidirectional iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest_last – Refers to the end of the destination range.

Returns

The uninitialized_relocate_backward algorithm returns BiIter2. The uninitialized_relocate_backward algorithm returns the bidirectional iterator to the first element in the destination range.
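Processing the elements in reverse order matters when the destination overlaps the tail of the source, for example when shifting live elements toward the end of a buffer. The following hand-written sketch illustrates this under the same simplifications as before (no exception handling, no byte-copy fast path); relocate_backward_sketch and shift_right_by_two are hypothetical names, not HPX functions.

```cpp
#include <memory>
#include <new>
#include <string>
#include <utility>

// Hypothetical helper, not part of HPX: relocates [first, last) so that the
// destination range ends at dest_last, working from the back to the front.
// Assumes the move constructor does not throw.
template <typename T>
T* relocate_backward_sketch(T* first, T* last, T* dest_last)
{
    while (last != first)
    {
        --last;
        --dest_last;
        ::new (static_cast<void*>(dest_last)) T(std::move(*last));
        std::destroy_at(last);    // frees this slot for a later relocation
    }
    return dest_last;    // first element of the destination range
}

// Holds three strings in slots [0, 3) of a five-slot raw buffer, relocates
// them to slots [2, 5), and returns their concatenation.
std::string shift_right_by_two()
{
    std::allocator<std::string> alloc;
    std::string* buf = alloc.allocate(5);
    ::new (static_cast<void*>(buf)) std::string("a");
    ::new (static_cast<void*>(buf + 1)) std::string("b");
    ::new (static_cast<void*>(buf + 2)) std::string("c");

    // Slot 2 belongs to both ranges; it is vacated before it is reused.
    std::string* new_first = relocate_backward_sketch(buf, buf + 3, buf + 5);

    std::string joined;
    for (std::string* p = new_first; p != buf + 5; ++p)
        joined += *p;

    std::destroy(new_first, buf + 5);
    alloc.deallocate(buf, 5);
    return joined;
}
```

A front-to-back relocation of the same ranges would construct into slot 2 while the original element there is still alive, which is why the backward variant exists.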

template<typename ExPolicy, typename BiIter1, typename BiIter2>
hpx::parallel::util::detail::algorithm_result<ExPolicy, BiIter2> uninitialized_relocate_backward(ExPolicy &&policy, BiIter1 first, BiIter1 last, BiIter2 dest_last)#

Relocates the elements in the range, defined by [first, last), to an uninitialized memory area ending at dest_last. The order of the relocation of the objects depends on the execution policy. If an exception is thrown during the move-construction of an element, all elements left in the input range, as well as all objects already constructed in the destination range are destroyed. After this algorithm completes, the source range should be freed or reused without destroying the objects.

The assignments in the parallel uninitialized_relocate_backward algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Using the uninitialized_relocate_backward algorithm with a non-sequenced execution policy does not guarantee the order in which the objects are relocated.

Note

Complexity: time: O(n), space: O(1).

  • For “trivially relocatable” underlying types (T) and a contiguous iterator range [first, last): std::distance(first, last)*sizeof(T) bytes are copied.

  • For “trivially relocatable” underlying types (T) and a non-contiguous iterator range [first, last): std::distance(first, last) memory copies of sizeof(T) bytes each are performed.

  • For “non-trivially relocatable” underlying types (T): std::distance(first, last) move constructions and destructions are performed.

Note

Declare a type as “trivially relocatable” using the HPX_DECLARE_TRIVIALLY_RELOCATABLE macros found in <hpx/type_support/is_trivially_relocatable.hpp>.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • BiIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a bidirectional iterator.

  • BiIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a bidirectional iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest_last – Refers to the end of the destination range.

Returns

The uninitialized_relocate_backward algorithm returns a hpx::future<BiIter2>, if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns BiIter2 otherwise. The uninitialized_relocate_backward algorithm returns the bidirectional iterator to the first element in the destination range.

template<typename InIter, typename Size, typename FwdIter>
FwdIter uninitialized_relocate_n(InIter first, Size count, FwdIter dest)#

Relocates the elements in the range, defined by [first, first + count), to an uninitialized memory area beginning at dest. If an exception is thrown during the move-construction of an element, all elements left in the input range, as well as all objects already constructed in the destination range are destroyed. After this algorithm completes, the source range should be freed or reused without destroying the objects.

The assignments in the parallel uninitialized_relocate_n algorithm invoked without an execution policy object will execute in sequential order in the calling thread.

Note

Complexity: time: O(n), space: O(1).

  • For “trivially relocatable” underlying types (T) and a contiguous iterator range [first, first+count): count*sizeof(T) bytes are copied.

  • For “trivially relocatable” underlying types (T) and a non-contiguous iterator range [first, first+count): count memory copies of sizeof(T) bytes each are performed.

  • For “non-trivially relocatable” underlying types (T): count move constructions and destructions are performed.

Note

Declare a type as “trivially relocatable” using the HPX_DECLARE_TRIVIALLY_RELOCATABLE macros found in <hpx/type_support/is_trivially_relocatable.hpp>.

Template Parameters
  • InIter – The type of the source iterator first (deduced). This iterator type must meet the requirements of an input iterator.

  • Size – The type of the argument specifying the number of elements to relocate.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The uninitialized_relocate_n algorithm returns FwdIter. The uninitialized_relocate_n algorithm returns the output iterator to the element in the destination range, one past the last element relocated.

template<typename ExPolicy, typename InIter, typename Size, typename FwdIter>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> uninitialized_relocate_n(ExPolicy &&policy, InIter first, Size count, FwdIter dest)#

Relocates the elements in the range, defined by [first, first + count), to an uninitialized memory area beginning at dest. If an exception is thrown during the move-construction of an element, all elements left in the input range, as well as all objects already constructed in the destination range are destroyed. After this algorithm completes, the source range should be freed or reused without destroying the objects.

The assignments in the parallel uninitialized_relocate_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel uninitialized_relocate_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: time: O(n), space: O(1).

  • For “trivially relocatable” underlying types (T) and a contiguous iterator range [first, first+count): count*sizeof(T) bytes are copied.

  • For “trivially relocatable” underlying types (T) and a non-contiguous iterator range [first, first+count): count memory copies of sizeof(T) bytes each are performed.

  • For “non-trivially relocatable” underlying types (T): count move constructions and destructions are performed.

Note

Declare a type as “trivially relocatable” using the HPX_DECLARE_TRIVIALLY_RELOCATABLE macros found in <hpx/type_support/is_trivially_relocatable.hpp>.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • InIter – The type of the source iterator first (deduced). This iterator type must meet the requirements of an input iterator.

  • Size – The type of the argument specifying the number of elements to relocate.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The uninitialized_relocate_n algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The uninitialized_relocate_n algorithm returns the output iterator to the element in the destination range, one past the last element relocated.

hpx::uninitialized_value_construct, hpx::uninitialized_value_construct_n#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter>
void uninitialized_value_construct(FwdIter first, FwdIter last)#

Constructs objects of type typename std::iterator_traits<FwdIter>::value_type in the uninitialized storage designated by the range [first, last) by value-initialization. If an exception is thrown during the initialization, the function has no effects.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters

FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The uninitialized_value_construct algorithm returns nothing.
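Value-initialization zeroes scalar types, which a short sketch can make visible. It uses std::uninitialized_value_construct as a stand-in so that it builds without HPX; the hpx:: function is assumed to initialize the elements the same way. sum_after_value_construct is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Pre-fills raw storage for `n` ints with a non-zero byte pattern,
// value-initializes the ints in place, and returns their sum.
int sum_after_value_construct(std::size_t n)
{
    std::allocator<int> alloc;
    int* first = alloc.allocate(n);

    // Scribble over the raw bytes so that zeroing is observable.
    std::memset(first, 0xFF, n * sizeof(int));

    // Stand-in for hpx::uninitialized_value_construct(first, last).
    std::uninitialized_value_construct(first, first + n);

    int sum = 0;
    for (std::size_t i = 0; i != n; ++i)
        sum += first[i];

    std::destroy(first, first + n);
    alloc.deallocate(first, n);
    return sum;
}
```

This is the observable difference from default-initialization, which would leave the values of the ints indeterminate.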

template<typename ExPolicy, typename FwdIter>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy> uninitialized_value_construct(ExPolicy &&policy, FwdIter first, FwdIter last)#

Constructs objects of type typename std::iterator_traits<FwdIter>::value_type in the uninitialized storage designated by the range [first, last) by value-initialization. If an exception is thrown during the initialization, the function has no effects. Executed according to the policy.

The assignments in the parallel uninitialized_value_construct algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel uninitialized_value_construct algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The uninitialized_value_construct algorithm returns a hpx::future<void>, if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns nothing otherwise.

template<typename FwdIter, typename Size>
FwdIter uninitialized_value_construct_n(FwdIter first, Size count)#

Constructs objects of type typename iterator_traits<FwdIter>::value_type in the uninitialized storage designated by the range [first, first + count) by value-initialization. If an exception is thrown during the initialization, the function has no effects.

Note

Complexity: Performs exactly count assignments, if count > 0, no assignments otherwise.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to construct.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

Returns

The uninitialized_value_construct_n algorithm returns FwdIter. The uninitialized_value_construct_n algorithm returns the iterator to the element in the source range, one past the last element constructed.
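To illustrate the returned iterator, here is a small sketch based on the standard library counterpart std::uninitialized_value_construct_n (C++17), assuming the sequential HPX overload returns the same position. The function name construct_prefix and the buffer layout are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Hypothetical helper: constructs `count` empty strings at the start of a
// raw buffer with room for `capacity` strings and reports how far the
// returned iterator is from the start of the buffer.
inline std::size_t construct_prefix(std::size_t capacity, std::size_t count)
{
    assert(count <= capacity);

    auto* buf = static_cast<std::string*>(
        ::operator new(capacity * sizeof(std::string)));

    // Returns a pointer one past the last element constructed.
    std::string* end = std::uninitialized_value_construct_n(buf, count);
    std::size_t const constructed = static_cast<std::size_t>(end - buf);

    // Only [buf, end) holds live objects, so only that part is destroyed.
    std::destroy(buf, end);
    ::operator delete(buf);
    return constructed;
}
```

The returned iterator marks the boundary between constructed and still-uninitialized storage, which is what later cleanup code needs to know.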

template<typename ExPolicy, typename FwdIter, typename Size>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter> uninitialized_value_construct_n(ExPolicy &&policy, FwdIter first, Size count)#

Constructs objects of type typename iterator_traits<FwdIter>::value_type in the uninitialized storage designated by the range [first, first + count) by value-initialization. If an exception is thrown during the initialization, the function has no effects. Executed according to the policy.

The assignments in the parallel uninitialized_value_construct_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel uninitialized_value_construct_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly count assignments, if count > 0, no assignments otherwise.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to construct.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

Returns

The uninitialized_value_construct_n algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The uninitialized_value_construct_n algorithm returns the iterator to the element in the source range, one past the last element constructed.

hpx::unique, hpx::unique_copy#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx

Functions

template<typename FwdIter, typename Pred = hpx::parallel::detail::equal_to, typename Proj = hpx::identity>
FwdIter unique(FwdIter first, FwdIter last, Pred &&pred = Pred(), Proj &&proj = Proj())#

Eliminates all but the first element from every consecutive group of equivalent elements from the range [first, last) and returns a past-the-end iterator for the new logical end of the range.

Note

Complexity: Performs not more than last - first assignments, exactly last - first - 1 applications of the predicate pred and no more than twice as many applications of the projection proj.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of unique requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Specifies the function (or function object) which will be invoked for each pair of adjacent elements in the sequence specified by [first, last). This is a binary predicate which returns true if the two elements should be treated as equivalent. The signature of this predicate should be equivalent to:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of type FwdIter can be dereferenced and then implicitly converted to both Type1 and Type2.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate pred is invoked.

Returns

The unique algorithm returns FwdIter. The unique algorithm returns the iterator to the new end of the range.
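The returned iterator only marks the new logical end; the container itself keeps its size. A minimal sketch of the usual follow-up step, written against the standard counterpart std::unique on the assumption that the sequential HPX overload behaves the same way (dedupe_adjacent is a hypothetical name):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: removes consecutive duplicates and shrinks the
// vector to the new logical end returned by the algorithm.
inline std::vector<int> dedupe_adjacent(std::vector<int> v)
{
    // Elements in [new_end, v.end()) are left in a valid but
    // unspecified state, so they are erased explicitly.
    auto new_end = std::unique(v.begin(), v.end());
    assert(new_end <= v.end());
    v.erase(new_end, v.end());
    return v;
}
```

Note that only consecutive duplicates are removed: an input of 1, 1, 2, 2, 2, 3, 1, 1 becomes 1, 2, 3, 1, with the trailing 1 kept because it is not adjacent to the first group.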

template<typename ExPolicy, typename FwdIter, typename Pred = hpx::parallel::detail::equal_to, typename Proj = hpx::identity>
parallel::util::detail::algorithm_result<ExPolicy, FwdIter>::type unique(ExPolicy &&policy, FwdIter first, FwdIter last, Pred &&pred = Pred(), Proj &&proj = Proj())#

Eliminates all but the first element from every consecutive group of equivalent elements from the range [first, last) and returns a past-the-end iterator for the new logical end of the range. Executed according to the policy.

The assignments in the parallel unique algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel unique algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs not more than last - first assignments, exactly last - first - 1 applications of the predicate pred and no more than twice as many applications of the projection proj.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of unique requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – Specifies the function (or function object) which will be invoked for each pair of adjacent elements in the sequence specified by [first, last). This is a binary predicate which returns true if the two elements should be treated as equivalent. The signature of this predicate should be equivalent to:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of type FwdIter can be dereferenced and then implicitly converted to both Type1 and Type2.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate pred is invoked.

Returns

The unique algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The unique algorithm returns the iterator to the new end of the range.
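The interplay of pred and proj can be pictured as pred(proj(a), proj(b)) for each pair of adjacent elements. The sketch below emulates that with the standard, sequential std::unique, which takes no projection; the adapter with_projection and the Reading type are hypothetical and not part of HPX, and the equivalence with the HPX overload is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Reading
{
    std::string sensor;
    int value;
};

// Hypothetical adapter: folds a projection into a binary predicate so
// that the result compares proj(a) with proj(b).
template <typename Pred, typename Proj>
auto with_projection(Pred pred, Proj proj)
{
    return [pred = std::move(pred), proj = std::move(proj)](
               auto const& a, auto const& b) {
        return pred(proj(a), proj(b));
    };
}

// Keeps the first reading of every consecutive run from the same sensor.
inline std::vector<Reading> drop_repeated_sensors(std::vector<Reading> v)
{
    auto same_sensor = with_projection(
        [](std::string const& a, std::string const& b) { return a == b; },
        [](Reading const& r) -> std::string const& { return r.sensor; });

    v.erase(std::unique(v.begin(), v.end(), same_sensor), v.end());
    return v;
}
```

With an overload that accepts proj directly, the adapter would be unnecessary: the projection would select the sensor member and the default predicate would compare the projected values.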

template<typename InIter, typename OutIter, typename Pred = hpx::parallel::detail::equal_to, typename Proj = hpx::identity>
OutIter unique_copy(InIter first, InIter last, OutIter dest, Pred &&pred = Pred(), Proj &&proj = Proj())#

Copies the elements from the range [first, last), to another range beginning at dest in such a way that there are no consecutive equal elements. Only the first element of each group of equal elements is copied.

Note

Complexity: Performs not more than last - first assignments, exactly last - first - 1 applications of the predicate pred and no more than twice as many applications of the projection proj

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of unique_copy requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • pred – Specifies the function (or function object) which will be invoked for each pair of adjacent elements in the sequence specified by [first, last). This is a binary predicate which returns true if the two elements should be treated as equivalent. The signature of this predicate should be equivalent to:

    bool pred(const Type &a, const Type &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type InIter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate pred is invoked.

Returns

The unique_copy algorithm returns OutIter. The unique_copy algorithm returns the destination iterator to the end of the dest range.
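A short sketch of unique_copy with a custom predicate, using the standard counterpart std::unique_copy under the assumption that the sequential HPX overload matches it. The function squeeze_runs_ignore_case is a hypothetical example; its predicate treats two characters as equal when they differ only in case.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <iterator>
#include <string>

// Hypothetical helper: copies `in` to a new string, keeping only the
// first character of every run of case-insensitively equal characters.
// The source string is left untouched.
inline std::string squeeze_runs_ignore_case(std::string const& in)
{
    std::string out;
    std::unique_copy(in.begin(), in.end(), std::back_inserter(out),
        [](char a, char b) {
            // Cast through unsigned char: std::tolower is undefined
            // for negative values other than EOF.
            return std::tolower(static_cast<unsigned char>(a)) ==
                std::tolower(static_cast<unsigned char>(b));
        });
    assert(out.size() <= in.size());
    return out;
}
```

Unlike unique, the input range is not modified, so no erase step is needed afterwards.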

template<typename ExPolicy, typename FwdIter1, typename FwdIter2, typename Pred = hpx::parallel::detail::equal_to, typename Proj = hpx::identity>
parallel::util::detail::algorithm_result<ExPolicy, FwdIter2>::type unique_copy(ExPolicy &&policy, FwdIter1 first, FwdIter1 last, FwdIter2 dest, Pred &&pred = Pred(), Proj &&proj = Proj())#

Copies the elements from the range [first, last), to another range beginning at dest in such a way that there are no consecutive equal elements. Only the first element of each group of equal elements is copied. Executed according to the policy.

The assignments in the parallel unique_copy algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel unique_copy algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs not more than last - first assignments, exactly last - first - 1 applications of the predicate pred and no more than twice as many applications of the projection proj

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of unique_copy requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • pred – Specifies the function (or function object) which will be invoked for each pair of adjacent elements in the sequence specified by [first, last). This is a binary predicate which returns true if the two elements should be treated as equivalent. The signature of this predicate should be equivalent to:

    bool pred(const Type &a, const Type &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter1 can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate pred is invoked.

Returns

The unique_copy algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The unique_copy algorithm returns the destination iterator to the end of the dest range.

hpx::ranges::adjacent_difference#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace ranges#

Functions

template<typename FwdIter1, typename FwdIter2, typename Sent>
FwdIter2 adjacent_difference(FwdIter1 first, Sent last, FwdIter2 dest)#

Computes the difference between each element of the range [first, last) and its predecessor and writes the results to the range beginning at dest. The first element of the range is copied to dest unchanged.

Note

Complexity: Exactly (last - first) - 1 applications of the subtraction operator.

Template Parameters
  • FwdIter1 – The type of the source iterators used for the range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for FwdIter1.

Parameters
  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The adjacent_difference algorithm returns an iterator to the element past the last element written to the destination range.
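For orientation, adjacent_difference follows the model of std::adjacent_difference: the first output element is a copy of the first input element, and every later output element is the current input element minus its predecessor. The sketch below uses the standard version (from the numeric header) and assumes the HPX ranges overload produces the same output; the helper name differences is hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: returns {in[0], in[1] - in[0], in[2] - in[1], ...}.
inline std::vector<int> differences(std::vector<int> const& in)
{
    std::vector<int> out(in.size());

    // The returned iterator points one past the last element written.
    auto out_end = std::adjacent_difference(in.begin(), in.end(), out.begin());
    assert(out_end == out.end());
    (void) out_end;    // silence the unused warning when NDEBUG is defined

    return out;
}
```

The destination range has to provide room for as many elements as the source range, since one value is written per input element.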

template<typename Rng, typename FwdIter2>
FwdIter2 adjacent_difference(Rng &&rng, FwdIter2 dest)#

Computes the difference between each element of the range rng and its predecessor and writes the results to the range beginning at dest. The first element of rng is copied to dest unchanged.

Note

Complexity: Exactly (std::end(rng) - std::begin(rng)) - 1 applications of the subtraction operator.

Template Parameters
  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The adjacent_difference algorithm returns an iterator to the element past the last element written to the destination range.

template<typename ExPolicy, typename FwdIter1, typename Sent, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> adjacent_difference(ExPolicy &&policy, FwdIter1 first, Sent last, FwdIter2 dest)#

Computes the difference between each element of the range [first, last) and its predecessor and writes the results to the range beginning at dest. The first element of the range is copied to dest unchanged. Executed according to the policy.

Note

Complexity: Exactly (last - first) - 1 applications of the subtraction operator.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for FwdIter1.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

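The adjacent_difference algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The adjacent_difference algorithm returns an iterator to the element past the last element written to the destination range.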

template<typename ExPolicy, typename Rng, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> adjacent_difference(ExPolicy &&policy, Rng &&rng, FwdIter2 dest)#

Computes the difference between each element of the range rng and its predecessor and writes the results to the range beginning at dest. The first element of rng is copied to dest unchanged. Executed according to the policy.

Note

Complexity: Exactly (std::end(rng) - std::begin(rng)) - 1 applications of the subtraction operator.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

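The adjacent_difference algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The adjacent_difference algorithm returns an iterator to the element past the last element written to the destination range.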

template<typename FwdIter1, typename Sent, typename FwdIter2, typename Op>
FwdIter2 adjacent_difference(FwdIter1 first, Sent last, FwdIter2 dest, Op &&op)#

Applies the binary operation op to each element of the range [first, last) and its predecessor and writes the results to the range beginning at dest. The first element of the range is copied to dest unchanged.

Note

Complexity: Exactly (last - first) - 1 applications of the binary operation op.

Template Parameters
  • FwdIter1 – The type of the source iterators used for the range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for FwdIter1.

  • Op – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of adjacent_difference requires Op to meet the requirements of CopyConstructible.

Parameters
  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – Binary operation function object that will be applied. The signature of the function should be equivalent to the following:

    Ret fun(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &. The types Type1 and Type2 must be such that an object of type iterator_traits<FwdIter1>::value_type can be implicitly converted to both of them. The type Ret must be such that an object of type FwdIter2 can be dereferenced and assigned a value of type Ret.

Returns

The adjacent_difference algorithm returns an iterator to the element past the last element written to the destination range.
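A custom op replaces the subtraction. In the standard counterpart std::adjacent_difference, op is called with the current element first and its predecessor second; the sketch below relies on that order and assumes the HPX overload uses the same one. The function percent_change and its integer arithmetic are hypothetical choices for illustration.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: out[0] is a copy of in[0]; every later element is
// the integer percentage change relative to the previous input element.
inline std::vector<int> percent_change(std::vector<int> const& in)
{
    std::vector<int> out(in.size());
    std::adjacent_difference(in.begin(), in.end(), out.begin(),
        [](int current, int previous) {
            assert(previous != 0);    // division below requires this
            return (current - previous) * 100 / previous;
        });
    return out;
}
```

The operation is never applied to the first element, so out[0] always holds the unmodified first input value.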

template<typename Rng, typename FwdIter2, typename Op>
FwdIter2 adjacent_difference(Rng &&rng, FwdIter2 dest, Op &&op)#

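Applies the binary operation op to each element of the range rng and its predecessor and writes the results to the range beginning at dest. The first element of rng is copied to dest unchanged.

Template Parameters
  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.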

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • Op – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of adjacent_difference requires Op to meet the requirements of CopyConstructible.

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – Binary operation function object that will be applied. The signature of the function should be equivalent to the following:

    Ret fun(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &. The types Type1 and Type2 must be such that an object of the value type of the source range can be implicitly converted to both of them. The type Ret must be such that an object of type FwdIter2 can be dereferenced and assigned a value of type Ret.

Returns

The adjacent_difference algorithm returns an iterator to the element past the last element written to the destination range.

template<typename ExPolicy, typename FwdIter1, typename Sent, typename FwdIter2, typename Op>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> adjacent_difference(ExPolicy &&policy, FwdIter1 first, Sent last, FwdIter2 dest, Op &&op)#

Applies the binary operation op to each element of the range [first, last) and its predecessor and writes the results to the range beginning at dest. The first element of the range is copied to dest unchanged. Executed according to the policy.

Note

Complexity: Exactly (last - first) - 1 applications of the binary operation op.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used for the range (deduced). This iterator type must meet the requirements of a forward iterator.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for FwdIter1.

  • Op – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of adjacent_difference requires Op to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – Binary operation function object that will be applied. The signature of the function should be equivalent to the following:

    Ret fun(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &. The types Type1 and Type2 must be such that an object of type iterator_traits<FwdIter1>::value_type can be implicitly converted to both of them. The type Ret must be such that an object of type FwdIter2 can be dereferenced and assigned a value of type Ret.

Returns

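The adjacent_difference algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The adjacent_difference algorithm returns an iterator to the element past the last element written to the destination range.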

template<typename ExPolicy, typename Rng, typename FwdIter2, typename Op>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, FwdIter2> adjacent_difference(ExPolicy &&policy, Rng &&rng, FwdIter2 dest, Op &&op)#

Applies the binary operation op to each element of the range rng and its predecessor and writes the results to the range beginning at dest. The first element of rng is copied to dest unchanged. Executed according to the policy.

Note

Complexity: Exactly (std::end(rng) - std::begin(rng)) - 1 applications of the binary operation op.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • Op – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of adjacent_difference requires Op to meet the requirements of CopyConstructible.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • op – Binary operation function object that will be applied. The signature of the function should be equivalent to the following:

    Ret fun(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &. The types Type1 and Type2 must be such that an object of the value type of the source range can be implicitly converted to both of them. The type Ret must be such that an object of type FwdIter2 can be dereferenced and assigned a value of type Ret.

Returns

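The adjacent_difference algorithm returns a hpx::future<FwdIter2> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter2 otherwise. The adjacent_difference algorithm returns an iterator to the element past the last element written to the destination range.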

hpx::ranges::adjacent_find#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace ranges

Functions

template<typename FwdIter, typename Sent, typename Proj = hpx::identity, typename Pred = detail::equal_to>
FwdIter adjacent_find(FwdIter first, Sent last, Pred &&pred = Pred(), Proj &&proj = Proj())#

Searches the range [first, last) for two consecutive identical elements.

Note

Complexity: Exactly the smaller of (result - first) + 1 and (last - first) - 1 application of the predicate where result is the value returned

Template Parameters
  • FwdIter – The type of the source iterators used for the range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for FwdIter.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

  • Pred – The type of an optional function/function object to use.

Parameters
  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • pred – The binary predicate which returns true if the elements should be treated as equal. The signature should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type1.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The adjacent_find algorithm returns an iterator to the first of the identical elements. If no such elements are found, last is returned.
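The sketch below shows both the default comparison and a custom predicate, using the standard counterpart std::adjacent_find and assuming the sequential HPX ranges overload returns the same position. Both helper names are hypothetical; the second one passes std::greater_equal to locate the first place where a sequence stops being strictly increasing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: index of the first element that equals its
// successor, or v.size() if there is no such pair.
inline std::size_t first_repeat_index(std::vector<int> const& v)
{
    auto it = std::adjacent_find(v.begin(), v.end());
    return static_cast<std::size_t>(it - v.begin());
}

// Hypothetical helper: index of the first element that is greater than
// or equal to its successor, or v.size() if v is strictly increasing.
inline std::size_t first_descent_index(std::vector<int> const& v)
{
    auto it = std::adjacent_find(v.begin(), v.end(), std::greater_equal<>{});
    assert(it <= v.end());
    return static_cast<std::size_t>(it - v.begin());
}
```

When no pair matches, the algorithm returns last, which these helpers report as an index equal to the size of the vector.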

template<typename ExPolicy, typename FwdIter, typename Sent, typename Proj = hpx::identity, typename Pred = detail::equal_to>
parallel::util::detail::algorithm_result<ExPolicy, FwdIter>::type adjacent_find(ExPolicy &&policy, FwdIter first, Sent last, Pred &&pred = Pred(), Proj &&proj = Proj())#

Searches the range [first, last) for two consecutive identical elements. This version uses the given binary predicate pred. Executed according to the policy.

The comparison operations in the parallel adjacent_find invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel adjacent_find invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

This overload of adjacent_find is available if the user decides to provide their own binary predicate pred.

Note

Complexity: Exactly the smaller of (result - first) + 1 and (last - first) - 1 application of the predicate where result is the value returned

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used for the range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for FwdIter.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of adjacent_find requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • pred – The binary predicate which returns true if the elements should be treated as equal. The signature should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that objects of type FwdIter can be dereferenced and then implicitly converted to Type1.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The adjacent_find algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The adjacent_find algorithm returns an iterator to the first of the identical elements. If no such elements are found, last is returned.
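The interplay of pred and proj described above can be illustrated with a small sequential sketch. It does not call HPX; it uses std::adjacent_find from the C++ standard library and applies the projection by hand, on the assumption that the HPX overload passes the projected values of two neighbouring elements to pred. The Employee type and the helpers first_adjacent_match and first_adjacent_same_dept are hypothetical names introduced only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical record type, used only for this illustration.
struct Employee
{
    std::string name;
    int dept;
};

// Returns the index of the first element whose projected value matches
// (via pred) the projected value of its successor, or v.size() if there
// is no such pair. This mirrors "last is returned" for the no-match case.
template <typename Pred, typename Proj>
std::size_t first_adjacent_match(
    std::vector<Employee> const& v, Pred pred, Proj proj)
{
    auto it = std::adjacent_find(v.begin(), v.end(),
        [&](Employee const& a, Employee const& b) {
            // Project both neighbours first, then apply the predicate.
            return pred(proj(a), proj(b));
        });
    return static_cast<std::size_t>(it - v.begin());
}

// Finds the first pair of neighbours that share a department.
inline std::size_t first_adjacent_same_dept(std::vector<Employee> const& v)
{
    return first_adjacent_match(
        v, std::equal_to<>{}, [](Employee const& e) { return e.dept; });
}
```

The predicate only ever sees projected values (here, the dept member), so it can stay a plain equality comparison while the projection selects what "identical" means.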

template<typename Rng, typename Proj = hpx::identity, typename Pred = detail::equal_to>
hpx::traits::range_traits<Rng>::iterator_type adjacent_find(Rng &&rng, Pred &&pred = Pred(), Proj &&proj = Proj())#

Searches the range rng for two consecutive identical elements.

Note

Complexity: Exactly the smaller of (result - std::begin(rng)) + 1 and (std::end(rng) - std::begin(rng)) - 1 applications of the predicate, where result is the value returned.

Template Parameters
  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

  • Pred – The type of an optional function/function object to use.

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • pred – The binary predicate which returns true if the elements should be treated as equal. The signature should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that an iterator of rng can be dereferenced and then implicitly converted to Type1.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The adjacent_find algorithm returns an iterator to the first of the identical elements. If no such elements are found, the end iterator of rng is returned.
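The complexity bound stated in the note can be checked for the sequential case with a short sketch. It uses std::adjacent_find from the C++ standard library as a stand-in, not the HPX algorithm, and it only covers sequential execution. The helper names count_predicate_calls and documented_bound are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how often the predicate is applied by a sequential adjacent_find.
inline std::size_t count_predicate_calls(std::vector<int> const& v)
{
    std::size_t calls = 0;
    auto it = std::adjacent_find(v.begin(), v.end(), [&calls](int a, int b) {
        ++calls;
        return a == b;
    });
    static_cast<void>(it);    // only the call count matters here
    return calls;
}

// Evaluates the documented bound:
//   min((result - first) + 1, (last - first) - 1)
// Assumes a non-empty range so that (last - first) - 1 does not underflow.
inline std::size_t documented_bound(std::vector<int> const& v)
{
    auto result = std::adjacent_find(v.begin(), v.end());
    auto const up_to_result =
        static_cast<std::size_t>(result - v.begin()) + 1;
    auto const all_pairs = v.size() - 1;
    return std::min(up_to_result, all_pairs);
}
```

For {1, 2, 3, 3, 4} the match is found at index 2, so three comparisons are made. Without a match, every one of the (last - first) - 1 neighbouring pairs is compared.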

template<typename ExPolicy, typename Rng, typename Proj = hpx::identity, typename Pred = detail::equal_to>
parallel::util::detail::algorithm_result<ExPolicy, typename hpx::traits::range_traits<Rng>::iterator_type>::type adjacent_find(ExPolicy &&policy, Rng &&rng, Pred &&pred = Pred(), Proj &&proj = Proj())#

Searches the range rng for two consecutive identical elements.

The comparison operations in the parallel adjacent_find invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel adjacent_find invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

This overload of adjacent_find lets the caller supply a custom binary predicate pred.

Note

Complexity: Exactly the smaller of (result - std::begin(rng)) + 1 and (std::end(rng) - std::begin(rng)) - 1 applications of the predicate, where result is the value returned.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the comparisons.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of adjacent_find requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • pred – The binary predicate which returns true if the elements should be treated as equal. The signature should be equivalent to the following:

    bool pred(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type1 must be such that an iterator of rng can be dereferenced and then implicitly converted to Type1.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The adjacent_find algorithm returns a hpx::future<iterator> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns iterator otherwise, where iterator is hpx::traits::range_traits<Rng>::iterator_type. The adjacent_find algorithm returns an iterator to the first of the identical elements. If no such elements are found, the end iterator of rng is returned.

hpx::ranges::all_of, hpx::ranges::any_of, hpx::ranges::none_of#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace ranges

Functions

template<typename ExPolicy, typename Rng, typename F, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool> none_of(ExPolicy &&policy, Rng &&rng, F &&f, Proj &&proj = Proj())#

Checks if unary predicate f returns true for no elements in the range rng.

The applications of function objects in a parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The applications of function objects in a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most std::distance(begin(rng), end(rng)) applications of the predicate f

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of none_of requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by rng. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an iterator of rng can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The none_of algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The none_of algorithm returns true if the unary predicate f returns true for no elements in the range, false otherwise. It returns true if the range is empty.

template<typename ExPolicy, typename Iter, typename Sent, typename F, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool> none_of(ExPolicy &&policy, Iter first, Sent last, F &&f, Proj &&proj = Proj())#

Checks if unary predicate f returns true for no elements in the range [first, last).

The applications of function objects in a parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The applications of function objects in a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of the predicate f

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of none_of requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type Iter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The none_of algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The none_of algorithm returns true if the unary predicate f returns true for no elements in the range, false otherwise. It returns true if the range is empty.

template<typename Rng, typename F, typename Proj = hpx::identity>
bool none_of(Rng &&rng, F &&f, Proj &&proj = Proj())#

Checks if unary predicate f returns true for no elements in the range rng.

Note

Complexity: At most std::distance(begin(rng), end(rng)) applications of the predicate f

Template Parameters
  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of none_of requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by rng. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an iterator of rng can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The none_of algorithm returns true if the unary predicate f returns true for no elements in the range, false otherwise. It returns true if the range is empty.
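As a minimal sketch of how f and proj cooperate, the following code emulates a none_of call with a projection using std::none_of from the C++ standard library. It is not HPX code and assumes that the projection is applied to each element before the predicate is invoked, as the parameter description states. The helper names none_of_projected and no_word_longer_than are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Sequential stand-in for a none_of call with a projection: proj is
// applied to each element, and f only sees the projected value.
template <typename Rng, typename F, typename Proj>
bool none_of_projected(Rng const& rng, F f, Proj proj)
{
    return std::none_of(std::begin(rng), std::end(rng),
        [&](auto const& element) { return f(proj(element)); });
}

// True if no word is longer than max_len characters.
// An empty list yields true, matching the documented behaviour.
inline bool no_word_longer_than(
    std::vector<std::string> const& words, std::size_t max_len)
{
    return none_of_projected(
        words,
        [max_len](std::size_t len) { return len > max_len; },
        [](std::string const& w) { return w.size(); });
}
```

Here the predicate works on a length rather than on a string, which keeps it reusable for any element type that a projection can map to a size.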

template<typename Iter, typename Sent, typename F, typename Proj = hpx::identity>
bool none_of(Iter first, Sent last, F &&f, Proj &&proj = Proj())#

Checks if unary predicate f returns true for no elements in the range [first, last).

Note

Complexity: At most last - first applications of the predicate f

Template Parameters
  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of none_of requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type Iter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The none_of algorithm returns true if the unary predicate f returns true for no elements in the range, false otherwise. It returns true if the range is empty.

template<typename ExPolicy, typename Rng, typename F, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool> any_of(ExPolicy &&policy, Rng &&rng, F &&f, Proj &&proj = Proj())#

Checks if unary predicate f returns true for at least one element in the range rng.

The applications of function objects in a parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The applications of function objects in a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most std::distance(begin(rng), end(rng)) applications of the predicate f

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of any_of requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by rng. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an iterator of rng can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The any_of algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The any_of algorithm returns true if the unary predicate f returns true for at least one element in the range, false otherwise. It returns false if the range is empty.

template<typename ExPolicy, typename Iter, typename Sent, typename F, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool> any_of(ExPolicy &&policy, Iter first, Sent last, F &&f, Proj &&proj = Proj())#

Checks if unary predicate f returns true for at least one element in the range [first, last).

The applications of function objects in a parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The applications of function objects in a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of the predicate f

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of any_of requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type Iter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The any_of algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The any_of algorithm returns true if the unary predicate f returns true for at least one element in the range, false otherwise. It returns false if the range is empty.

template<typename Rng, typename F, typename Proj = hpx::identity>
bool any_of(Rng &&rng, F &&f, Proj &&proj = Proj())#

Checks if unary predicate f returns true for at least one element in the range rng.

Note

Complexity: At most std::distance(begin(rng), end(rng)) applications of the predicate f

Template Parameters
  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of any_of requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by rng. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an iterator of rng can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The any_of algorithm returns true if the unary predicate f returns true for at least one element in the range, false otherwise. It returns false if the range is empty.
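The return value and the "at most" complexity bound can be illustrated with std::any_of from the C++ standard library, used here as a sequential stand-in rather than the HPX algorithm. The helper names any_negative and predicate_calls are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// True if at least one element is negative; false for an empty range.
inline bool any_negative(std::vector<int> const& v)
{
    return std::any_of(v.begin(), v.end(), [](int x) { return x < 0; });
}

// Number of predicate applications made by a sequential any_of.
// The count never exceeds the number of elements in the range.
inline std::size_t predicate_calls(std::vector<int> const& v)
{
    std::size_t calls = 0;
    bool const found = std::any_of(v.begin(), v.end(), [&calls](int x) {
        ++calls;
        return x < 0;
    });
    static_cast<void>(found);    // only the call count matters here
    return calls;
}
```

When no element satisfies the predicate, every element has to be inspected, so the bound is reached. A match allows the algorithm to stop early, which is why the documentation states an upper bound instead of an exact count.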

template<typename Iter, typename Sent, typename F, typename Proj = hpx::identity>
bool any_of(Iter first, Sent last, F &&f, Proj &&proj = Proj())#

Checks if unary predicate f returns true for at least one element in the range [first, last).

Note

Complexity: At most last - first applications of the predicate f

Template Parameters
  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of any_of requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type Iter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The any_of algorithm returns true if the unary predicate f returns true for at least one element in the range, false otherwise. It returns false if the range is empty.

template<typename ExPolicy, typename Rng, typename F, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool> all_of(ExPolicy &&policy, Rng &&rng, F &&f, Proj &&proj = Proj())#

Checks if unary predicate f returns true for all elements in the range rng.

The applications of function objects in a parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The applications of function objects in a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most std::distance(begin(rng), end(rng)) applications of the predicate f

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of all_of requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by rng. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an iterator of rng can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The all_of algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The all_of algorithm returns true if the unary predicate f returns true for all elements in the range, false otherwise. It returns true if the range is empty.

template<typename ExPolicy, typename Iter, typename Sent, typename F, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool> all_of(ExPolicy &&policy, Iter first, Sent last, F &&f, Proj &&proj = Proj())#

Checks if unary predicate f returns true for all elements in the range [first, last).

The applications of function objects in a parallel algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The applications of function objects in a parallel algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of the predicate f

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it applies user-provided function objects.

  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of all_of requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type Iter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The all_of algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The all_of algorithm returns true if the unary predicate f returns true for all elements in the range, false otherwise. It returns true if the range is empty.

template<typename Rng, typename F, typename Proj = hpx::identity>
bool all_of(Rng &&rng, F &&f, Proj &&proj = Proj())#

Checks if unary predicate f returns true for all elements in the range rng.

Note

Complexity: At most std::distance(begin(rng), end(rng)) applications of the predicate f

Template Parameters
  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of all_of requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by rng. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an iterator of rng can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The all_of algorithm returns true if the unary predicate f returns true for all elements in the range, false otherwise. It returns true if the range is empty.
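The empty-range behaviour and the relationship between all_of and none_of can be sketched with the standard-library counterparts std::all_of and std::none_of. This is a sequential illustration, not HPX code, and the helper names all_even and none_odd are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// True if every element is even; true for an empty range as well,
// because there is no element for which the predicate fails.
inline bool all_even(std::vector<int> const& v)
{
    return std::all_of(v.begin(), v.end(), [](int x) { return x % 2 == 0; });
}

// Equivalent formulation: all elements are even exactly when none is odd.
inline bool none_odd(std::vector<int> const& v)
{
    return std::none_of(v.begin(), v.end(), [](int x) { return x % 2 != 0; });
}
```

Both functions agree on every input, including the empty range, which is why all_of and none_of both return true there while any_of returns false.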

template<typename Iter, typename Sent, typename F, typename Proj = hpx::identity>
bool all_of(Iter first, Sent last, F &&f, Proj &&proj = Proj())#

Checks if unary predicate f returns true for all elements in the range [first, last).

Note

Complexity: At most last - first applications of the predicate f

Template Parameters
  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of all_of requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type Iter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The all_of algorithm returns true if the unary predicate f returns true for all elements in the range, false otherwise. It returns true if the range is empty.

hpx::ranges::copy, hpx::ranges::copy_n, hpx::ranges::copy_if#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace ranges

Functions

template<typename ExPolicy, typename FwdIter1, typename Sent1, typename FwdIter>
parallel::util::detail::algorithm_result<ExPolicy, ranges::copy_result<FwdIter1, FwdIter>>::type copy(ExPolicy &&policy, FwdIter1 iter, Sent1 sent, FwdIter dest)#

Copies the elements in the range [iter, sent) to another range beginning at dest.

The assignments in the parallel copy algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel copy algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly sent - iter assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the begin source iterator used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent1 – The type of the end source iterator used (deduced). This type must meet the requirements of a sentinel for FwdIter1.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • iter – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • sent – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The copy algorithm returns an hpx::future<ranges::copy_result<FwdIter1, FwdIter>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns ranges::copy_result<FwdIter1, FwdIter> otherwise. The result holds the input iterator advanced to the end of the source range (the position denoted by sent) and the output iterator to the element in the destination range, one past the last element copied.

template<typename ExPolicy, typename Rng, typename FwdIter>
parallel::util::detail::algorithm_result<ExPolicy, ranges::copy_result<typename hpx::traits::range_traits<Rng>::iterator_type, FwdIter>>::type copy(ExPolicy &&policy, Rng &&rng, FwdIter dest)#

Copies the elements in the range rng to another range beginning at dest.

The assignments in the parallel copy algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel copy algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly std::distance(begin(rng), end(rng)) assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The copy algorithm returns an hpx::future<ranges::copy_result<iterator_t<Rng>, FwdIter>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns ranges::copy_result<iterator_t<Rng>, FwdIter> otherwise. The result holds the input iterator at the end of rng and the output iterator to the element in the destination range, one past the last element copied.
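The rule that a task policy yields a future while any other policy yields the plain result can be pictured with a small type trait. The sketch is a simplified stand-in written against the standard library only: sync_policy, task_policy, result_for and add are hypothetical names invented for this illustration, std::future stands in for hpx::future, and none of it reflects how HPX actually implements algorithm_result.

```cpp
#include <cassert>
#include <future>
#include <type_traits>

// Hypothetical policy tags; these are NOT the HPX execution policy types.
struct sync_policy {};
struct task_policy {};

// Simplified stand-in for a policy-dependent result type:
// T for ordinary policies, std::future<T> for the task policy.
template <typename Policy, typename T>
struct result_for
{
    using type = T;
};

template <typename T>
struct result_for<task_policy, T>
{
    using type = std::future<T>;
};

template <typename Policy, typename T>
using result_for_t = typename result_for<Policy, T>::type;

// The return type is fixed at compile time by the policy type.
static_assert(std::is_same_v<result_for_t<sync_policy, int>, int>);
static_assert(std::is_same_v<result_for_t<task_policy, int>, std::future<int>>);

// Hypothetical "algorithm" whose return type follows the policy.
template <typename Policy>
result_for_t<Policy, int> add(Policy, int a, int b)
{
    if constexpr (std::is_same_v<Policy, task_policy>)
    {
        // std::launch::deferred keeps the sketch thread-free;
        // the work runs when get() is called on the future.
        return std::async(std::launch::deferred, [a, b] { return a + b; });
    }
    else
    {
        return a + b;
    }
}

// Usage: the synchronous call yields an int, the task call yields a future.
inline void add_example()
{
    int direct = add(sync_policy{}, 1, 2);
    std::future<int> later = add(task_policy{}, 1, 2);
    assert(direct == 3);
    assert(later.get() == 3);
}
```

Callers of a task-policy overload therefore have to call get() (or attach a continuation) before using the iterators in the result.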

template<typename FwdIter1, typename Sent1, typename FwdIter>
ranges::copy_result<FwdIter1, FwdIter> copy(FwdIter1 iter, Sent1 sent, FwdIter dest)#

Copies the elements in the range [iter, sent) to another range beginning at dest.

Note

Complexity: Performs exactly sent - iter assignments.

Template Parameters
  • FwdIter1 – The type of the begin source iterator used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent1 – The type of the end source iterator used (deduced). This type must meet the requirements of a sentinel for FwdIter1.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • iter – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • sent – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The copy algorithm returns the pair of the input iterator advanced to the end of the source range (the position denoted by sent) and the output iterator to the element in the destination range, one past the last element copied.

template<typename Rng, typename FwdIter>
ranges::copy_result<typename hpx::traits::range_traits<Rng>::iterator_type, FwdIter> copy(Rng &&rng, FwdIter dest)#

Copies the elements in the range rng to another range beginning at dest.

Note

Complexity: Performs exactly std::distance(begin(rng), end(rng)) assignments.

Template Parameters
  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The copy algorithm returns the pair of the input iterator at the end of rng and the output iterator to the element in the destination range, one past the last element copied.

template<typename ExPolicy, typename FwdIter1, typename Size, typename FwdIter2>
hpx::parallel::util::detail::algorithm_result<ExPolicy, ranges::copy_n_result<FwdIter1, FwdIter2>>::type copy_n(ExPolicy &&policy, FwdIter1 first, Size count, FwdIter2 dest)#

Copies the elements in the range [first, first + count), starting from first and proceeding to first + count - 1, to another range beginning at dest.

The assignments in the parallel copy_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel copy_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly count assignments, if count > 0, no assignments otherwise.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to copy.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The copy_n algorithm returns an hpx::future<ranges::copy_n_result<FwdIter1, FwdIter2>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns ranges::copy_n_result<FwdIter1, FwdIter2> otherwise. The copy_n algorithm returns the pair of the input iterator advanced past the last element copied from the input sequence and the output iterator to the element in the destination range, one past the last element copied.

template<typename FwdIter1, typename Size, typename FwdIter2>
ranges::copy_n_result<FwdIter1, FwdIter2> copy_n(FwdIter1 first, Size count, FwdIter2 dest)#

Copies the elements in the range [first, first + count), starting from first and proceeding to first + count - 1, to another range beginning at dest.

Note

Complexity: Performs exactly count assignments, if count > 0, no assignments otherwise.

Template Parameters
  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to copy.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

Returns

The copy_n algorithm returns the pair of the input iterator advanced past the last element copied from the input sequence and the output iterator to the element in the destination range, one past the last element copied.

template<typename ExPolicy, typename FwdIter1, typename Sent1, typename FwdIter, typename Pred, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result<ExPolicy, ranges::copy_if_result<FwdIter1, FwdIter>>::type copy_if(ExPolicy &&policy, FwdIter1 iter, Sent1 sent, FwdIter dest, Pred &&pred, Proj &&proj = Proj())#

Copies the elements in the range [iter, sent) for which the predicate pred returns true to another range beginning at dest. The relative order of the copied elements is preserved.

The assignments in the parallel copy_if algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel copy_if algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the begin source iterator used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent1 – The type of the end source iterator used (deduced). This type must meet the requirements of a sentinel for FwdIter1.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • Pred – The type of the function/function object to use (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • iter – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • sent – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • pred – The unary predicate which returns true for the elements that should be copied. The signature should be equivalent to the following:

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter1 can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The copy_if algorithm returns an hpx::future<ranges::copy_if_result<FwdIter1, FwdIter>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns ranges::copy_if_result<FwdIter1, FwdIter> otherwise. The result holds the input iterator advanced to the end of the source range (the position denoted by sent) and the output iterator to the element in the destination range, one past the last element copied.

template<typename ExPolicy, typename Rng, typename FwdIter, typename Pred, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result<ExPolicy, ranges::copy_if_result<typename hpx::traits::range_traits<Rng>::iterator_type, FwdIter>>::type copy_if(ExPolicy &&policy, Rng &&rng, FwdIter dest, Pred &&pred, Proj &&proj = Proj())#

Copies the elements in the range rng for which the predicate pred returns true to another range beginning at dest. The relative order of the copied elements is preserved.

The assignments in the parallel copy_if algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel copy_if algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • Pred – The type of the function/function object to use (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • pred – The unary predicate which returns true for the elements that should be copied. The signature should be equivalent to the following:

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that an object of the iterator type of Rng can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The copy_if algorithm returns an hpx::future<ranges::copy_if_result<iterator_t<Rng>, FwdIter>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns ranges::copy_if_result<iterator_t<Rng>, FwdIter> otherwise. The result holds the input iterator at the end of rng and the output iterator to the element in the destination range, one past the last element copied.

template<typename FwdIter1, typename Sent1, typename FwdIter, typename Pred, typename Proj = hpx::identity>
ranges::copy_if_result<FwdIter1, FwdIter> copy_if(FwdIter1 iter, Sent1 sent, FwdIter dest, Pred &&pred, Proj &&proj = Proj())#

Copies the elements in the range [iter, sent) for which the predicate pred returns true to another range beginning at dest. The relative order of the copied elements is preserved.

Template Parameters
  • FwdIter1 – The type of the begin source iterator used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent1 – The type of the end source iterator used (deduced). This type must meet the requirements of a sentinel for FwdIter1.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • Pred – The type of the function/function object to use (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • iter – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • sent – Refers to the end of the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • pred – The unary predicate which returns true for the elements that should be copied. The signature should be equivalent to the following:

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that an object of type FwdIter1 can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The copy_if algorithm returns the pair of the input iterator advanced to the end of the source range (the position denoted by sent) and the output iterator to the element in the destination range, one past the last element copied.

template<typename Rng, typename FwdIter, typename Pred, typename Proj = hpx::identity>
ranges::copy_if_result<typename hpx::traits::range_traits<Rng>::iterator_type, FwdIter> copy_if(Rng &&rng, FwdIter dest, Pred &&pred, Proj &&proj = Proj())#

Copies the elements in the range rng for which the predicate pred returns true to another range beginning at dest. The relative order of the copied elements is preserved.

Template Parameters
  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • FwdIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of an output iterator.

  • Pred – The type of the function/function object to use (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • pred – The unary predicate which returns true for the elements that should be copied. The signature should be equivalent to the following:

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that an object of the iterator type of Rng can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The copy_if algorithm returns the pair of the input iterator at the end of rng and the output iterator to the element in the destination range, one past the last element copied.

hpx::ranges::count, hpx::ranges::count_if#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace ranges

Functions

template<typename ExPolicy, typename Rng, typename Proj = hpx::identity, typename T = typename hpx::parallel::traits::projected<hpx::traits::range_iterator_t<Rng>, Proj>::value_type>
hpx::parallel::util::detail::algorithm_result<ExPolicy, typename std::iterator_traits<typename hpx::traits::range_traits<Rng>::iterator_type>::difference_type>::type count(ExPolicy &&policy, Rng &&rng, T const &value, Proj &&proj = Proj())#

Returns the number of elements in the range rng satisfying a specific criterion. This version counts the elements that are equal to the given value.

The comparisons in the parallel count algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Note

Complexity: Performs exactly std::distance(begin(rng), end(rng)) comparisons.

Note

The comparisons in the parallel count algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the comparisons.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • T – The type of the value to search for (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • value – The value to search for.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The count algorithm returns an hpx::future<difference_type> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns difference_type otherwise (where difference_type is defined by std::iterator_traits<Iter>::difference_type, with Iter being the iterator type of Rng). The count algorithm returns the number of elements satisfying the given criteria.

template<typename ExPolicy, typename Iter, typename Sent, typename Proj = hpx::identity, typename T = typename hpx::parallel::traits::projected<Iter, Proj>::value_type>
hpx::parallel::util::detail::algorithm_result<ExPolicy, typename std::iterator_traits<Iter>::difference_type>::type count(ExPolicy &&policy, Iter first, Sent last, T const &value, Proj &&proj = Proj())#

Returns the number of elements in the range [first, last) satisfying a specific criterion. This version counts the elements that are equal to the given value.

The comparisons in the parallel count algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Note

Complexity: Performs exactly last - first comparisons.

Note

The comparisons in the parallel count algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the comparisons.

  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

  • T – The type of the value to search for (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • value – The value to search for.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The count algorithm returns an hpx::future<difference_type> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns difference_type otherwise (where difference_type is defined by std::iterator_traits<Iter>::difference_type). The count algorithm returns the number of elements satisfying the given criteria.

template<typename Rng, typename Proj = hpx::identity, typename T = typename hpx::parallel::traits::projected<hpx::traits::range_iterator_t<Rng>, Proj>::value_type>
std::iterator_traits<typename hpx::traits::range_traits<Rng>::iterator_type>::difference_type count(Rng &&rng, T const &value, Proj &&proj = Proj())#

Returns the number of elements in the range rng satisfying a specific criterion. This version counts the elements that are equal to the given value.

Note

Complexity: Performs exactly std::distance(begin(rng), end(rng)) comparisons.

Template Parameters
  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • T – The type of the value to search for (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • value – The value to search for.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The count algorithm returns the number of elements satisfying the given criteria.

template<typename Iter, typename Sent, typename Proj = hpx::identity, typename T = typename hpx::parallel::traits::projected<Iter, Proj>::value_type>
std::iterator_traits<Iter>::difference_type count(Iter first, Sent last, T const &value, Proj &&proj = Proj())#

Returns the number of elements in the range [first, last) satisfying a specific criterion. This version counts the elements that are equal to the given value.

Note

Complexity: Performs exactly last - first comparisons.

Template Parameters
  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

  • T – The type of the value to search for (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • value – The value to search for.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The count algorithm returns the number of elements satisfying the given criteria.

template<typename ExPolicy, typename Rng, typename F, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result<ExPolicy, typename std::iterator_traits<typename hpx::traits::range_traits<Rng>::iterator_type>::difference_type>::type count_if(ExPolicy &&policy, Rng &&rng, F &&f, Proj &&proj = Proj())#

Returns the number of elements in the range rng satisfying a specific criterion. This version counts elements for which predicate f returns true.

Note

Complexity: Performs exactly std::distance(begin(rng), end(rng)) applications of the predicate.

Note

The predicate invocations in the parallel count_if algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Note

The predicate invocations in the parallel count_if algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the comparisons.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of count_if requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the range rng. This is a unary predicate which returns true for the required elements. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of the iterator type of Rng can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The count_if algorithm returns an hpx::future<difference_type> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns difference_type otherwise (where difference_type is defined by std::iterator_traits<Iter>::difference_type, with Iter being the iterator type of Rng). The count_if algorithm returns the number of elements satisfying the given criteria.

template<typename ExPolicy, typename Iter, typename Sent, typename F, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result<ExPolicy, typename std::iterator_traits<Iter>::difference_type>::type count_if(ExPolicy &&policy, Iter first, Sent last, F &&f, Proj &&proj = Proj())#

Returns the number of elements in the range [first, last) satisfying a specific criterion. This version counts elements for which predicate f returns true.

Note

Complexity: Performs exactly last - first applications of the predicate.

Note

The predicate invocations in the parallel count_if algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

Note

The predicate invocations in the parallel count_if algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the comparisons.

  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of count_if requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the required elements. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type Iter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The count_if algorithm returns hpx::future<difference_type> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns difference_type otherwise (where difference_type is defined by std::iterator_traits<Iter>::difference_type). The count_if algorithm returns the number of elements satisfying the given criteria.

template<typename Rng, typename F, typename Proj = hpx::identity>
std::iterator_traits<typename hpx::traits::range_traits<Rng>::iterator_type>::difference_type count_if(Rng &&rng, F &&f, Proj &&proj = Proj())#

Returns the number of elements in the range rng satisfying specific criteria. This version counts elements for which predicate f returns true.

Note

Complexity: Performs exactly N applications of the predicate, where N is the number of elements in rng.

Template Parameters
  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of count_if requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by rng. This is a unary predicate which returns true for the required elements. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an iterator of the range Rng can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The count_if algorithm returns the number of elements satisfying the given criteria.
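The interplay of predicate and projection described above can be illustrated without HPX. The following is a minimal sketch in plain C++17, not the HPX implementation: count_if_sketch, employee, and count_senior are hypothetical names introduced only for this illustration. It applies the projection to each element first and then hands the projected value to the predicate.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical record type used only for this illustration.
struct employee
{
    std::string name;
    int age;
};

// Hand-written sketch of the documented semantics (not HPX code):
// apply proj to each element, then pred to the projected value,
// and count the elements for which pred returns true.
template <typename Rng, typename F, typename Proj>
std::ptrdiff_t count_if_sketch(Rng const& rng, F pred, Proj proj)
{
    std::ptrdiff_t n = 0;
    for (auto const& elem : rng)
    {
        if (std::invoke(pred, std::invoke(proj, elem)))
            ++n;
    }
    return n;
}

// Counts employees aged 40 or older by projecting each element onto `age`.
inline std::ptrdiff_t count_senior(std::vector<employee> const& staff)
{
    return count_if_sketch(
        staff, [](int age) { return age >= 40; }, &employee::age);
}
```

Because the projection selects the age member, the predicate only ever sees an int and does not need to know about employee. Assuming the range overload documented here accepts the same three arguments, the equivalent call would pass the range, the predicate, and &employee::age in that order.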

template<typename Iter, typename Sent, typename F, typename Proj = hpx::identity>
std::iterator_traits<Iter>::difference_type count_if(Iter first, Sent last, F &&f, Proj &&proj = Proj())#

Returns the number of elements in the range [first, last) satisfying specific criteria. This version counts elements for which predicate f returns true.

Note

Complexity: Performs exactly last - first applications of the predicate.

Template Parameters
  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

  • F – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of count_if requires F to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • f – Specifies the function (or function object) which will be invoked for each of the elements in the sequence specified by [first, last). This is a unary predicate which returns true for the required elements. The signature of this predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The type Type must be such that an object of type Iter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The count_if algorithm returns the number of elements satisfying the given criteria.
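The Iter and Sent template parameters allow the end of the sequence to be a different type from the iterator. The sketch below shows the idea in plain C++17 with a hand-written loop, not the HPX implementation: null_sentinel, count_if_sketch, and count_vowels are hypothetical names, and the sentinel simply compares equal to a char const* that points at the terminating null character.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical sentinel type: marks the end of a null-terminated string.
struct null_sentinel
{
};

// An iterator has reached the sentinel when it points at '\0'.
inline bool operator==(char const* it, null_sentinel)
{
    return *it == '\0';
}

inline bool operator!=(char const* it, null_sentinel s)
{
    return !(it == s);
}

// Hand-written sketch (not HPX code): walks [first, last) where `last`
// may be a sentinel of a different type than `first`.
template <typename Iter, typename Sent, typename F>
std::ptrdiff_t count_if_sketch(Iter first, Sent last, F pred)
{
    std::ptrdiff_t n = 0;
    for (; first != last; ++first)
    {
        if (std::invoke(pred, *first))
            ++n;
    }
    return n;
}

// Counts lowercase vowels without computing the string length up front.
inline std::ptrdiff_t count_vowels(char const* text)
{
    return count_if_sketch(text, null_sentinel{}, [](char c) {
        return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';
    });
}
```

The loop performs one predicate application per element, which matches the complexity stated above. A sentinel accepted by the documented overload may have to satisfy additional requirements that this sketch does not model.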

hpx::ranges::destroy, hpx::ranges::destroy_n#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace ranges

Functions

template<typename ExPolicy, typename Rng>
hpx::parallel::util::detail::algorithm_result<ExPolicy, hpx::traits::range_iterator_t<Rng>> destroy(ExPolicy &&policy, Rng &&rng)#

Destroys the objects in the range rng. The destroyed objects have the value type of the iterators of Rng.

The operations in the parallel destroy algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The operations in the parallel destroy algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first operations.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

Returns

The destroy algorithm returns an hpx::future<hpx::traits::range_iterator_t<Rng>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns hpx::traits::range_iterator_t<Rng> otherwise, as declared in the signature above.

template<typename ExPolicy, typename Iter, typename Sent>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, Iter> destroy(ExPolicy &&policy, Iter first, Sent last)#

Destroys objects of type typename iterator_traits<Iter>::value_type in the range [first, last).

The operations in the parallel destroy algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The operations in the parallel destroy algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first operations.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The destroy algorithm returns an hpx::future<Iter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns Iter otherwise, as declared in the signature above.

template<typename Rng>
hpx::traits::range_iterator<Rng>::type destroy(Rng &&rng)#

Destroys the objects in the range rng. The destroyed objects have the value type of the iterators of Rng.

Note

Complexity: Performs exactly last - first operations.

Template Parameters

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

Parameters

  • rng – Refers to the sequence of elements the algorithm will be applied to.

Returns

The destroy algorithm returns an iterator of type hpx::traits::range_iterator<Rng>::type, as declared in the signature above.

template<typename Iter, typename Sent>
Iter destroy(Iter first, Sent last)#

Destroys objects of type typename iterator_traits<Iter>::value_type in the range [first, last).

Note

Complexity: Performs exactly last - first operations.

Template Parameters
  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

Returns

The destroy algorithm returns an iterator of type Iter, as declared in the signature above.
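Destroying a range only runs destructors; it does not release the underlying storage. The sketch below illustrates this with the C++17 standard library function std::destroy as a stand-in for the overloads documented here, on the assumption that they behave analogously. Note that std::destroy itself returns nothing, while the signatures above are declared with an iterator return type. The names tracked and construct_then_destroy are hypothetical.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical element type that records how many instances are alive.
struct tracked
{
    static inline int alive = 0;
    tracked() { ++alive; }
    ~tracked() { --alive; }
};

// Constructs n objects in raw storage, destroys them again, and returns
// how many destructor calls were observed.
inline int construct_then_destroy(std::size_t n)
{
    // Raw, uninitialized storage: no tracked objects exist yet.
    std::allocator<tracked> alloc;
    tracked* first = alloc.allocate(n);
    tracked* last = first + n;

    std::uninitialized_default_construct(first, last);
    int const alive_after_construct = tracked::alive;

    // Runs the destructor of every element in [first, last);
    // the storage itself stays allocated.
    std::destroy(first, last);
    int const alive_after_destroy = tracked::alive;

    // Releasing the storage is a separate step.
    alloc.deallocate(first, n);
    return alive_after_construct - alive_after_destroy;
}
```

Exactly one destructor call happens per element, which corresponds to the "last - first operations" complexity stated above.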

template<typename ExPolicy, typename FwdIter, typename Size>
hpx::parallel::util::detail::algorithm_result<ExPolicy, FwdIter>::type destroy_n(ExPolicy &&policy, FwdIter first, Size count)#

Destroys objects of type typename iterator_traits<FwdIter>::value_type in the range [first, first + count).

The operations in the parallel destroy_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The operations in the parallel destroy_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly count operations if count > 0, no operations otherwise.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to apply this algorithm to.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

Returns

The destroy_n algorithm returns a hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise. The destroy_n algorithm returns the iterator to the element in the source range, one past the last element destroyed.

template<typename FwdIter, typename Size>
FwdIter destroy_n(FwdIter first, Size count)#

Destroys objects of type typename iterator_traits<FwdIter>::value_type in the range [first, first + count).

Note

Complexity: Performs exactly count operations if count > 0, no operations otherwise.

Template Parameters
  • FwdIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to apply this algorithm to.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

Returns

The destroy_n algorithm returns the iterator to the element in the source range, one past the last element destroyed.
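The returned iterator is useful when only a prefix of a buffer is destroyed. The following sketch uses the C++17 standard library function std::destroy_n as a stand-in, assuming the overload documented here returns the same position. destroy_prefix is a hypothetical helper introduced for this illustration.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Destroys only the first `count` strings of a buffer holding `total`
// strings and reports how far the returned iterator advanced.
inline std::ptrdiff_t destroy_prefix(std::size_t total, std::size_t count)
{
    std::allocator<std::string> alloc;
    std::string* first = alloc.allocate(total);
    std::uninitialized_fill_n(first, total, std::string("hpx"));

    // Destroys [first, first + count) and returns first + count.
    std::string* rest = std::destroy_n(first, count);
    std::ptrdiff_t const advanced = rest - first;

    // The remaining elements [rest, first + total) are still alive and
    // must be destroyed before the storage is released.
    std::destroy(rest, first + total);
    alloc.deallocate(first, total);
    return advanced;
}
```

With count equal to zero nothing is destroyed and the returned iterator equals first, consistent with the complexity note above.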

hpx::ranges::ends_with#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace ranges

Functions

template<typename Iter1, typename Sent1, typename Iter2, typename Sent2, typename Pred = ranges::equal_to, typename Proj1 = hpx::identity, typename Proj2 = hpx::identity>
bool ends_with(Iter1 first1, Sent1 last1, Iter2 first2, Sent2 last2, Pred &&pred = Pred(), Proj1 &&proj1 = Proj1(), Proj2 &&proj2 = Proj2())#

Checks whether the second range defined by [first2, last2) matches the suffix of the first range defined by [first1, last1).

The comparisons in the ends_with algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: Linear: at most min(N1, N2) applications of the predicate and both projections.

Template Parameters
  • Iter1 – The type of the begin source iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • Sent1 – The type of the end source iterators used (deduced). This iterator type must meet the requirements of a sentinel for Iter1.

  • Iter2 – The type of the begin destination iterators used (deduced). This iterator type must meet the requirements of an input iterator.

  • Sent2 – The type of the end destination iterators used (deduced). This iterator type must meet the requirements of a sentinel for Iter2.

  • Pred – The binary predicate that compares the projected elements.

  • Proj1 – The type of an optional projection function for the source range. This defaults to hpx::identity

  • Proj2 – The type of an optional projection function for the destination range. This defaults to hpx::identity

Parameters
  • first1 – Refers to the beginning of the source range.

  • last1 – Sentinel value referring to the end of the source range.

  • first2 – Refers to the beginning of the destination range.

  • last2 – Sentinel value referring to the end of the destination range.

  • pred – Specifies the binary predicate function (or function object) which will be invoked for comparison of the elements in the two ranges projected by proj1 and proj2 respectively.

  • proj1 – Specifies the function (or function object) which will be invoked for each of the elements in the source range as a projection operation before the actual predicate is invoked.

  • proj2 – Specifies the function (or function object) which will be invoked for each of the elements in the destination range as a projection operation before the actual predicate is invoked.

Returns

The ends_with algorithm returns bool: true if the second range matches the suffix of the first range, false otherwise.
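The suffix check described above can be expressed with a few standard library calls. The following is a minimal sketch in plain C++17 using the default comparison and no projections; it is not the HPX implementation, and ends_with_sketch and has_extension are hypothetical names. It assumes both ranges are given as ordinary iterator pairs rather than iterator and sentinel.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Hand-written sketch (not HPX code): the second range is a suffix of the
// first when it is no longer than the first range and compares equal to the
// last N2 elements of the first range.
template <typename Iter1, typename Iter2>
bool ends_with_sketch(Iter1 first1, Iter1 last1, Iter2 first2, Iter2 last2)
{
    auto const n1 = std::distance(first1, last1);
    auto const n2 = std::distance(first2, last2);
    if (n2 > n1)
        return false;

    // Skip the leading n1 - n2 elements, then compare the tails.
    std::advance(first1, n1 - n2);
    return std::equal(first1, last1, first2, last2);
}

// Checks whether a file name ends with the given extension.
inline bool has_extension(std::string const& file, std::string const& ext)
{
    return ends_with_sketch(file.begin(), file.end(), ext.begin(), ext.end());
}
```

At most min(N1, N2) element comparisons are made, matching the complexity note above. An empty second range is a suffix of every range, so the sketch returns true in that case.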

template<typename ExPolicy, typename FwdIter1, typename Sent1, typename FwdIter2, typename Sent2, typename Pred = ranges::equal_to, typename Proj1 = hpx::identity, typename Proj2 = hpx::identity>
parallel::util::detail::algorithm_result<ExPolicy, bool>::type ends_with(ExPolicy &&policy, FwdIter1 first1, Sent1 last1, FwdIter2 first2, Sent2 last2, Pred &&pred = Pred(), Proj1 &&proj1 = Proj1(), Proj2 &&proj2 = Proj2())#

Checks whether the second range defined by [first2, last2) matches the suffix of the first range defined by [first1, last1).

The comparisons in the parallel ends_with algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparisons in the parallel ends_with algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Linear: at most min(N1, N2) applications of the predicate and both projections.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the begin source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent1 – The type of the end source iterators used (deduced). This iterator type must meet the requirements of a sentinel for FwdIter1.

  • FwdIter2 – The type of the begin destination iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent2 – The type of the end destination iterators used (deduced). This iterator type must meet the requirements of a sentinel for FwdIter2.

  • Pred – The binary predicate that compares the projected elements.

  • Proj1 – The type of an optional projection function for the source range. This defaults to hpx::identity

  • Proj2 – The type of an optional projection function for the destination range. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the source range.

  • last1 – Sentinel value referring to the end of the source range.

  • first2 – Refers to the beginning of the destination range.

  • last2 – Sentinel value referring to the end of the destination range.

  • pred – Specifies the binary predicate function (or function object) which will be invoked for comparison of the elements in the two ranges projected by proj1 and proj2 respectively.

  • proj1 – Specifies the function (or function object) which will be invoked for each of the elements in the source range as a projection operation before the actual predicate is invoked.

  • proj2 – Specifies the function (or function object) which will be invoked for each of the elements in the destination range as a projection operation before the actual predicate is invoked.

Returns

The ends_with algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The ends_with algorithm returns a boolean with the value true if the second range matches the suffix of the first range, false otherwise.

template<typename Rng1, typename Rng2, typename Pred = ranges::equal_to, typename Proj1 = hpx::identity, typename Proj2 = hpx::identity>
bool ends_with(Rng1 &&rng1, Rng2 &&rng2, Pred &&pred = Pred(), Proj1 &&proj1 = Proj1(), Proj2 &&proj2 = Proj2())#

Checks whether the second range rng2 matches the suffix of the first range rng1.

The comparisons in the ends_with algorithm invoked without an execution policy object execute in sequential order in the calling thread.

Note

Complexity: Linear: at most min(N1, N2) applications of the predicate and both projections.

Template Parameters
  • Rng1 – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • Rng2 – The type of the destination range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • Pred – The binary predicate that compares the projected elements.

  • Proj1 – The type of an optional projection function for the source range. This defaults to hpx::identity

  • Proj2 – The type of an optional projection function for the destination range. This defaults to hpx::identity

Parameters
  • rng1 – Refers to the source range.

  • rng2 – Refers to the destination range.

  • pred – Specifies the binary predicate function (or function object) which will be invoked for comparison of the elements in the two ranges projected by proj1 and proj2 respectively.

  • proj1 – Specifies the function (or function object) which will be invoked for each of the elements in the source range as a projection operation before the actual predicate is invoked.

  • proj2 – Specifies the function (or function object) which will be invoked for each of the elements in the destination range as a projection operation before the actual predicate is invoked.

Returns

The ends_with algorithm returns bool: true if the second range matches the suffix of the first range, false otherwise.
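The pred, proj1, and proj2 parameters combine as follows: each element is projected first, and the predicate then compares the two projected values. The sketch below shows this for a case-insensitive suffix test in plain C++17. It is an illustration under stated assumptions, not the HPX implementation, and ends_with_sketch and ends_with_ignore_case are hypothetical names.

```cpp
#include <algorithm>
#include <cctype>
#include <functional>
#include <iterator>
#include <string>

// Hand-written sketch (not HPX code): compares the tail of rng1 with rng2,
// applying proj1/proj2 to the elements before calling pred on the results.
template <typename Rng1, typename Rng2, typename Pred, typename Proj1,
    typename Proj2>
bool ends_with_sketch(Rng1 const& rng1, Rng2 const& rng2, Pred pred,
    Proj1 proj1, Proj2 proj2)
{
    auto const n1 = std::distance(std::begin(rng1), std::end(rng1));
    auto const n2 = std::distance(std::begin(rng2), std::end(rng2));
    if (n2 > n1)
        return false;

    auto tail = std::next(std::begin(rng1), n1 - n2);
    return std::equal(tail, std::end(rng1), std::begin(rng2), std::end(rng2),
        [&](auto const& a, auto const& b) {
            return std::invoke(
                pred, std::invoke(proj1, a), std::invoke(proj2, b));
        });
}

// Case-insensitive suffix test: both ranges are projected to lower case.
inline bool ends_with_ignore_case(
    std::string const& text, std::string const& suffix)
{
    auto lower = [](char c) {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    };
    return ends_with_sketch(text, suffix, std::equal_to<>{}, lower, lower);
}
```

Keeping the case folding in the projections leaves the predicate as a plain equality test, which is the separation of concerns the two projection parameters are meant to support.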

template<typename ExPolicy, typename Rng1, typename Rng2, typename Pred = ranges::equal_to, typename Proj1 = hpx::identity, typename Proj2 = hpx::identity>
hpx::parallel::util::detail::algorithm_result<ExPolicy, bool>::type ends_with(ExPolicy &&policy, Rng1 &&rng1, Rng2 &&rng2, Pred &&pred = Pred(), Proj1 &&proj1 = Proj1(), Proj2 &&proj2 = Proj2())#

Checks whether the second range rng2 matches the suffix of the first range rng1.

The comparisons in the parallel ends_with algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparisons in the parallel ends_with algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Linear: at most min(N1, N2) applications of the predicate and both projections.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Rng1 – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • Rng2 – The type of the destination range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • Pred – The binary predicate that compares the projected elements.

  • Proj1 – The type of an optional projection function for the source range. This defaults to hpx::identity

  • Proj2 – The type of an optional projection function for the destination range. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng1 – Refers to the source range.

  • rng2 – Refers to the destination range.

  • pred – Specifies the binary predicate function (or function object) which will be invoked for comparison of the elements in the two ranges projected by proj1 and proj2 respectively.

  • proj1 – Specifies the function (or function object) which will be invoked for each of the elements in the source range as a projection operation before the actual predicate is invoked.

  • proj2 – Specifies the function (or function object) which will be invoked for each of the elements in the destination range as a projection operation before the actual predicate is invoked.

Returns

The ends_with algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The ends_with algorithm returns a boolean with the value true if the second range matches the suffix of the first range, false otherwise.

hpx::ranges::equal#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace ranges

Functions

template<typename ExPolicy, typename Iter1, typename Sent1, typename Iter2, typename Sent2, typename Pred = equal_to, typename Proj1 = hpx::identity, typename Proj2 = hpx::identity>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool> equal(ExPolicy &&policy, Iter1 first1, Sent1 last1, Iter2 first2, Sent2 last2, Pred &&op = Pred(), Proj1 &&proj1 = Proj1(), Proj2 &&proj2 = Proj2())#

Returns true if the range [first1, last1) is equal to the range [first2, last2), and false otherwise.

The comparison operations in the parallel equal algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel equal algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most min(last1 - first1, last2 - first2) applications of the predicate op.

Note

The two ranges are considered equal if, for every iterator i in the range [first1,last1), *i equals *(first2 + (i - first1)). This overload of equal uses operator== to determine if two elements are equal.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Iter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent1 – The type of the sentinel marking the end of the first range (deduced).

  • Iter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent2 – The type of the sentinel marking the end of the second range (deduced).

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of equal requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

  • Proj1 – The type of an optional projection function applied to the first range. This defaults to hpx::identity

  • Proj2 – The type of an optional projection function applied to the second range. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types Iter1 and Iter2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

  • proj1 – Specifies the function (or function object) which will be invoked for each of the elements of the first range as a projection operation before the actual predicate is invoked.

  • proj2 – Specifies the function (or function object) which will be invoked for each of the elements of the second range as a projection operation before the actual predicate is invoked.

Returns

The equal algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The equal algorithm returns true if the elements in the two ranges are equal, otherwise it returns false. If the length of the range [first1, last1) does not equal the length of the range [first2, last2), it returns false.
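Leaving the execution policy aside, the comparison rule described above (projected elements compared pairwise, with unequal lengths yielding false) can be sketched in plain C++17. This is a hand-written illustration and not the HPX implementation: equal_sketch, user, and same_ids are hypothetical names, and both ranges are assumed to be given as ordinary iterator pairs.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical record type used only for this illustration.
struct user
{
    int id;
    std::string name;
};

// Hand-written sketch (not HPX code): walks both ranges in lockstep and
// compares the projected elements with op.
template <typename Iter1, typename Iter2, typename Pred, typename Proj1,
    typename Proj2>
bool equal_sketch(Iter1 first1, Iter1 last1, Iter2 first2, Iter2 last2,
    Pred op, Proj1 proj1, Proj2 proj2)
{
    for (; first1 != last1 && first2 != last2; ++first1, ++first2)
    {
        if (!std::invoke(op, std::invoke(proj1, *first1),
                std::invoke(proj2, *first2)))
            return false;
    }
    // Both ranges must be exhausted together; otherwise the lengths differ.
    return first1 == last1 && first2 == last2;
}

// Compares a list of users against a list of ids by projecting onto `id`.
inline bool same_ids(
    std::vector<user> const& users, std::vector<int> const& ids)
{
    return equal_sketch(users.begin(), users.end(), ids.begin(), ids.end(),
        std::equal_to<>{}, &user::id, [](int id) { return id; });
}
```

The loop stops at the first mismatch, so the number of predicate applications never exceeds the length of the shorter range, in line with the complexity note above.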

template<typename ExPolicy, typename Rng1, typename Rng2, typename Pred = equal_to, typename Proj1 = hpx::identity, typename Proj2 = hpx::identity>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, bool> equal(ExPolicy &&policy, Rng1 &&rng1, Rng2 &&rng2, Pred &&op = Pred(), Proj1 &&proj1 = Proj1(), Proj2 &&proj2 = Proj2())#

Returns true if the range rng1 is equal to the range rng2, and false otherwise.

The comparison operations in the parallel equal algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel equal algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most N1 applications of the predicate op, where N1 is the number of elements in rng1.

Note

The two ranges are considered equal if, for every iterator i in the range [first1,last1), *i equals *(first2 + (i - first1)). This overload of equal uses operator== to determine if two elements are equal.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Rng1 – The type of the first source range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • Rng2 – The type of the second source range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of equal requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

  • Proj1 – The type of an optional projection function applied to the first range. This defaults to hpx::identity

  • Proj2 – The type of an optional projection function applied to the second range. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng1 – Refers to the first sequence of elements the algorithm will be applied to.

  • rng2 – Refers to the second sequence of elements the algorithm will be applied to.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that iterators of the ranges Rng1 and Rng2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively.

  • proj1 – Specifies the function (or function object) which will be invoked for each of the elements of the first range as a projection operation before the actual predicate is invoked.

  • proj2 – Specifies the function (or function object) which will be invoked for each of the elements of the second range as a projection operation before the actual predicate is invoked.

Returns

The equal algorithm returns a hpx::future<bool> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns bool otherwise. The equal algorithm returns true if the elements in the two ranges are equal, otherwise it returns false.

template<typename Iter1, typename Sent1, typename Iter2, typename Sent2, typename Pred = equal_to, typename Proj1 = hpx::identity, typename Proj2 = hpx::identity>
bool equal(Iter1 first1, Sent1 last1, Iter2 first2, Sent2 last2, Pred &&op = Pred(), Proj1 &&proj1 = Proj1(), Proj2 &&proj2 = Proj2())#

Returns true if the range [first1, last1) is equal to the range [first2, last2), and false otherwise.

Note

Complexity: At most min(last1 - first1, last2 - first2) applications of the predicate op.

Note

The two ranges are considered equal if, for every iterator i in the range [first1,last1), *i equals *(first2 + (i - first1)). This overload of equal uses operator== to determine if two elements are equal.

Template Parameters
  • Iter1 – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent1 – The type of the source iterators used for the end of the first range (deduced).

  • Iter2 – The type of the source iterators used for the second range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent2 – The type of the source iterators used for the end of the second range (deduced).

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of equal requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

  • Proj1 – The type of an optional projection function applied to the first range. This defaults to hpx::identity

  • Proj2 – The type of an optional projection function applied to the second range. This defaults to hpx::identity

Parameters
  • first1 – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last1 – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • first2 – Refers to the beginning of the sequence of elements of the second range the algorithm will be applied to.

  • last2 – Refers to the end of the sequence of elements of the second range the algorithm will be applied to.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of types Iter1 and Iter2 can be dereferenced and then implicitly converted to Type1 and Type2, respectively.

  • proj1 – Specifies the function (or function object) which will be invoked for each of the elements of the first range as a projection operation before the actual predicate is invoked.

  • proj2 – Specifies the function (or function object) which will be invoked for each of the elements of the second range as a projection operation before the actual predicate is invoked.

Returns

The equal algorithm returns true if the elements in the two ranges are equal, otherwise it returns false. If the length of the range [first1, last1) does not equal the length of the range [first2, last2), it returns false.
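To make the roles of op, proj1 and proj2 concrete, here is a minimal sequential sketch of the semantics described above: each projection is applied to its element before the predicate is invoked, and ranges of different length compare unequal. It uses only the standard library. The names equal_sketch, identity_sketch and to_lower_sketch are hypothetical and not part of HPX, and identity_sketch assumes that hpx::identity simply returns its argument.

```cpp
#include <cassert>
#include <cctype>
#include <functional>
#include <string>
#include <utility>

// Stand-in for hpx::identity (assumption: returns its argument unchanged).
struct identity_sketch
{
    template <typename T>
    constexpr T&& operator()(T&& t) const noexcept
    {
        return std::forward<T>(t);
    }
};

// Hypothetical sequential reference: compares proj1(*i) with proj2(*j)
// using op, element by element.
template <typename Iter1, typename Sent1, typename Iter2, typename Sent2,
    typename Pred = std::equal_to<>, typename Proj1 = identity_sketch,
    typename Proj2 = identity_sketch>
bool equal_sketch(Iter1 first1, Sent1 last1, Iter2 first2, Sent2 last2,
    Pred op = Pred(), Proj1 proj1 = Proj1(), Proj2 proj2 = Proj2())
{
    for (; first1 != last1 && first2 != last2; ++first1, ++first2)
    {
        if (!std::invoke(op, std::invoke(proj1, *first1),
                std::invoke(proj2, *first2)))
        {
            return false;
        }
    }
    // Equal only if both ranges were exhausted at the same time.
    return first1 == last1 && first2 == last2;
}

// Example projection: map a character to lower case.
inline char to_lower_sketch(char c)
{
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}
```

With these definitions, comparing "HPX" and "hpx" yields false with the default arguments and true when to_lower_sketch is passed as both projections.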

template<typename Rng1, typename Rng2, typename Pred = equal_to, typename Proj1 = hpx::identity, typename Proj2 = hpx::identity>
bool equal(Rng1 &&rng1, Rng2 &&rng2, Pred &&op = Pred(), Proj1 &&proj1 = Proj1(), Proj2 &&proj2 = Proj2())#

Returns true if the range rng1 is equal to the range rng2, and false otherwise. In the notes below, [first1, last1) denotes the elements of rng1 and first2 the beginning of rng2.

Note

Complexity: At most last1 - first1 applications of the predicate op.

Note

The two ranges are considered equal if, for every iterator i in the range [first1,last1), *i equals *(first2 + (i - first1)). This overload of equal uses operator== to determine if two elements are equal.

Template Parameters
  • Rng1 – The type of the first source range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • Rng2 – The type of the second source range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • Pred – The type of an optional function/function object to use. Unlike its sequential form, the parallel overload of equal requires Pred to meet the requirements of CopyConstructible. This defaults to std::equal_to<>

  • Proj1 – The type of an optional projection function applied to the first range. This defaults to hpx::identity

  • Proj2 – The type of an optional projection function applied to the second range. This defaults to hpx::identity

Parameters
  • rng1 – Refers to the first sequence of elements the algorithm will be applied to.

  • rng2 – Refers to the second sequence of elements the algorithm will be applied to.

  • op – The binary predicate which returns true if the elements should be treated as equal. The signature of the predicate function should be equivalent to the following:

    bool pred(const Type1 &a, const Type2 &b);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The types Type1 and Type2 must be such that objects of the iterator types of Rng1 and Rng2 can be dereferenced and then implicitly converted to Type1 and Type2, respectively.

  • proj1 – Specifies the function (or function object) which will be invoked for each of the elements of the first range as a projection operation before the actual predicate is invoked.

  • proj2 – Specifies the function (or function object) which will be invoked for each of the elements of the second range as a projection operation before the actual predicate is invoked.

Returns

The equal algorithm returns true if the elements in the two ranges are equal, otherwise it returns false.

hpx::ranges::exclusive_scan#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace ranges

Functions

template<typename InIter, typename Sent, typename OutIter, typename T = typename std::iterator_traits<InIter>::value_type, typename Op = std::plus<T>>
exclusive_scan_result<InIter, OutIter> exclusive_scan(InIter first, Sent last, OutIter dest, T init, Op &&op = Op())#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(op, init, *first, …, *(first + (i - dest) - 1)).

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum. If op is not mathematically associative, the behavior of exclusive_scan may be non-deterministic.

Note

Complexity: O(last - first) applications of the binary operation op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.

Template Parameters
  • InIter – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for InIter.

  • OutIter – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

  • Op – The type of the binary function object used for the reduction operation.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to sentinel value denoting the end of the sequence of elements the algorithm will be applied.

  • dest – Refers to the beginning of the destination range.

  • init – The initial value for the generalized sum.

  • op – Specifies the function (or function object) which will be invoked for each of the values of the input sequence. This is a binary function. The signature of this function should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Ret must be such that an object of a type as given by the input sequence can be implicitly converted to any of those types.

Returns

The exclusive_scan algorithm returns an input iterator to the point denoted by the sentinel and an output iterator to the element in the destination range, one past the last element copied.
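The following sequential sketch illustrates the semantics described above: the k-th output is the generalized sum of init and the first k input elements, so the k-th input element itself is excluded. It is a reference model under stated assumptions, not the HPX implementation. The name exclusive_scan_sketch is hypothetical, and std::pair stands in for exclusive_scan_result.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sequential reference:
// out[k] = op(...op(op(init, in[0]), in[1])..., in[k-1]), and out[0] = init.
template <typename InIter, typename Sent, typename OutIter, typename T,
    typename Op = std::plus<T>>
std::pair<InIter, OutIter> exclusive_scan_sketch(
    InIter first, Sent last, OutIter dest, T init, Op op = Op())
{
    for (; first != last; ++first, ++dest)
    {
        // Read *first before writing *dest so that scanning in place
        // (dest aliasing first) still sees the original input value.
        T next = op(init, *first);
        *dest = init;    // the sum that excludes the current element
        init = std::move(next);
    }
    // Input position at the sentinel, output position one past the last write.
    return {first, dest};
}
```

For the input 1, 2, 3, 4 with init 10 and the default std::plus, the destination receives 10, 11, 13, 16. The final sum, 20, is never written.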

template<typename ExPolicy, typename FwdIter1, typename Sent, typename FwdIter2, typename T = typename std::iterator_traits<FwdIter1>::value_type, typename Op = std::plus<T>>
parallel::util::detail::algorithm_result<ExPolicy, exclusive_scan_result<FwdIter1, FwdIter2>>::type exclusive_scan(ExPolicy &&policy, FwdIter1 first, Sent last, FwdIter2 dest, T init, Op &&op = Op())#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(op, init, *first, …, *(first + (i - dest) - 1)).

The reduce operations in the parallel exclusive_scan algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel exclusive_scan algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum. If op is not mathematically associative, the behavior of exclusive_scan may be non-deterministic.

Note

Complexity: O(last - first) applications of the binary operation op.

Note

GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aN) is defined as:

  • a1 when N is 1

  • op(GENERALIZED_NONCOMMUTATIVE_SUM(op, a1, …, aK), GENERALIZED_NONCOMMUTATIVE_SUM(op, aM, …, aN)) where 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter1 – The type of the source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for FwdIter1.

  • FwdIter2 – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

  • Op – The type of the binary function object used for the reduction operation.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to sentinel value denoting the end of the sequence of elements the algorithm will be applied.

  • dest – Refers to the beginning of the destination range.

  • init – The initial value for the generalized sum.

  • op – Specifies the function (or function object) which will be invoked for each of the values of the input sequence. This is a binary function. The signature of this function should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Ret must be such that an object of a type as given by the input sequence can be implicitly converted to any of those types.

Returns

The exclusive_scan algorithm returns an hpx::future<util::in_out_result<FwdIter1, FwdIter2>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns util::in_out_result<FwdIter1, FwdIter2> otherwise. The exclusive_scan algorithm returns an input iterator to the point denoted by the sentinel and an output iterator to the element in the destination range, one past the last element copied.
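The definition of GENERALIZED_NONCOMMUTATIVE_SUM leaves the split point K open, which is why associativity of op matters once the reduce operations may be grouped differently from run to run. The sketch below evaluates the two possible groupings for three operands. The helper names are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <functional>

// Left grouping: op(op(a1, a2), a3), i.e. K = 2 at the outermost split.
template <typename T, typename Op>
T sum_left_sketch(Op op, T a1, T a2, T a3)
{
    return op(op(a1, a2), a3);
}

// Right grouping: op(a1, op(a2, a3)), i.e. K = 1 at the outermost split.
template <typename T, typename Op>
T sum_right_sketch(Op op, T a1, T a2, T a3)
{
    return op(a1, op(a2, a3));
}
```

With std::plus<int> both groupings of 10, 3, 2 give 15. With std::minus<int> the left grouping gives 5 and the right grouping gives 9, so a scan using a non-associative operation has no single well-defined result when the grouping is unspecified.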

template<typename Rng, typename O, typename T = typename std::iterator_traits<hpx::traits::range_iterator_t<Rng>>::value_type, typename Op = std::plus<T>>
exclusive_scan_result<traits::range_iterator_t<Rng>, O> exclusive_scan(Rng &&rng, O dest, T init, Op &&op = Op())#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(+, init, *first, …, *(first + (i - dest) - 1)), where [first, last) denotes the elements of rng.

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum.

Note

Complexity: O(last - first) applications of the binary operation op (std::plus<T> by default).

Note

GENERALIZED_NONCOMMUTATIVE_SUM(+, a1, …, aN) is defined as:

  • a1 when N is 1

  • GENERALIZED_NONCOMMUTATIVE_SUM(+, a1, …, aK) + GENERALIZED_NONCOMMUTATIVE_SUM(+, aM, …, aN) where 1 < K+1 = M <= N.

Template Parameters
  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • O – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

  • Op – The type of the binary function object used for the reduction operation.

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • init – The initial value for the generalized sum.

  • op – Specifies the function (or function object) which will be invoked for each of the values of the input sequence. This is a binary function. The signature of this function should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Ret must be such that an object of a type as given by the input sequence can be implicitly converted to any of those types.

Returns

The exclusive_scan algorithm returns an input iterator to the point denoted by the sentinel and an output iterator to the element in the destination range, one past the last element copied.

template<typename ExPolicy, typename Rng, typename O, typename T = typename std::iterator_traits<hpx::traits::range_iterator_t<Rng>>::value_type, typename Op = std::plus<T>>
parallel::util::detail::algorithm_result<ExPolicy, exclusive_scan_result<traits::range_iterator_t<Rng>, O>> exclusive_scan(ExPolicy &&policy, Rng &&rng, O dest, T init, Op &&op = Op())#

Assigns through each iterator i in [dest, dest + (last - first)) the value of GENERALIZED_NONCOMMUTATIVE_SUM(+, init, *first, …, *(first + (i - dest) - 1)), where [first, last) denotes the elements of rng.

The reduce operations in the parallel exclusive_scan algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The reduce operations in the parallel exclusive_scan algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

The difference between exclusive_scan and inclusive_scan is that inclusive_scan includes the ith input element in the ith sum.

Note

Complexity: O(last - first) applications of the binary operation op (std::plus<T> by default).

Note

GENERALIZED_NONCOMMUTATIVE_SUM(+, a1, …, aN) is defined as:

  • a1 when N is 1

  • GENERALIZED_NONCOMMUTATIVE_SUM(+, a1, …, aK) + GENERALIZED_NONCOMMUTATIVE_SUM(+, aM, …, aN) where 1 < K+1 = M <= N.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of a forward iterator.

  • O – The type of the iterator representing the destination range (deduced). This iterator type must meet the requirements of a forward iterator.

  • T – The type of the value to be used as initial (and intermediate) values (deduced).

  • Op – The type of the binary function object used for the reduction operation.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • dest – Refers to the beginning of the destination range.

  • init – The initial value for the generalized sum.

  • op – Specifies the function (or function object) which will be invoked for each of the values of the input sequence. This is a binary function. The signature of this function should be equivalent to:

    Ret fun(const Type1 &a, const Type1 &b);
    
    The signature does not need to have const&, but the function must not modify the objects passed to it. The types Type1 and Ret must be such that an object of a type as given by the input sequence can be implicitly converted to any of those types.

Returns

The exclusive_scan algorithm returns an hpx::future<util::in_out_result<traits::range_iterator_t<Rng>, O>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns util::in_out_result<traits::range_iterator_t<Rng>, O> otherwise. The exclusive_scan algorithm returns an input iterator to the point denoted by the sentinel and an output iterator to the element in the destination range, one past the last element copied.

hpx::ranges::fill, hpx::ranges::fill_n#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace ranges

Functions

template<typename ExPolicy, typename Rng, typename T = typename std::iterator_traits<hpx::traits::range_iterator_t<Rng>>::value_type>
hpx::parallel::util::detail::algorithm_result<ExPolicy, typename hpx::traits::range_traits<Rng>::iterator_type>::type fill(ExPolicy &&policy, Rng &&rng, T const &value)#

Assigns the given value to the elements in the range [first, last).

The assignments in the parallel fill algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel fill algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • T – The type of the value to be assigned (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The fill algorithm returns an hpx::future<iterator> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns iterator otherwise, where iterator is hpx::traits::range_traits<Rng>::iterator_type as declared in the signature above.

template<typename ExPolicy, typename Iter, typename Sent, typename T = typename std::iterator_traits<Iter>::value_type>
hpx::parallel::util::detail::algorithm_result<ExPolicy, Iter>::type fill(ExPolicy &&policy, Iter first, Sent last, T const &value)#

Assigns the given value to the elements in the range [first, last).

The assignments in the parallel fill algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel fill algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

  • T – The type of the value to be assigned (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The fill algorithm returns an hpx::future<Iter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns Iter otherwise, as declared in the signature above.

template<typename Rng, typename T = typename std::iterator_traits<hpx::traits::range_iterator_t<Rng>>::value_type>
hpx::traits::range_iterator_t<Rng> fill(Rng &&rng, T const &value)#

Assigns the given value to the elements in the range [first, last).

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • T – The type of the value to be assigned (deduced).

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The fill algorithm returns an iterator of type hpx::traits::range_iterator_t<Rng>, as declared in the signature above.

template<typename Iter, typename Sent, typename T = typename std::iterator_traits<Iter>::value_type>
Iter fill(Iter first, Sent last, T const &value)#

Assigns the given value to the elements in the range [first, last).

Note

Complexity: Performs exactly last - first assignments.

Template Parameters
  • Iter – The type of the source iterators used for the range (deduced).

  • Sent – The type of the source sentinel (deduced). This sentinel type must be a sentinel for Iter.

  • T – The type of the value to be assigned (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements of the range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the range the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The fill algorithm returns an iterator of type Iter, as declared in the signature above.
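The Sent template parameter allows the end of the range to be described by something other than an iterator of the same type. The sketch below shows one way this can look: a hypothetical sentinel that ends the range at the first element equal to a stop value. The names fill_sketch and stop_value_sentinel are illustrative only. Returning the final iterator mirrors the Iter return type in the declaration above, and treating that position as the one denoted by the sentinel is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sentinel: the range ends at the first element equal to `stop`.
// The sequence must contain such an element, or the loop runs past the end.
struct stop_value_sentinel
{
    int stop;
};

template <typename Iter>
bool operator!=(Iter it, stop_value_sentinel s)
{
    return *it != s.stop;
}

// Hypothetical sequential reference for fill with an iterator/sentinel pair.
template <typename Iter, typename Sent, typename T>
Iter fill_sketch(Iter first, Sent last, T const& value)
{
    for (; first != last; ++first)
    {
        *first = value;    // exactly one assignment per element in the range
    }
    return first;    // position at which the sentinel matched
}
```

Filling 1, 2, 3, -1, 5 with 0 up to stop_value_sentinel{-1} leaves 0, 0, 0, -1, 5 and returns an iterator to the -1 element. Passing an ordinary end iterator as the sentinel fills the whole sequence.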

template<typename ExPolicy, typename Rng, typename T = typename std::iterator_traits<hpx::traits::range_iterator_t<Rng>>::value_type>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, hpx::traits::range_iterator_t<Rng>> fill_n(ExPolicy &&policy, Rng &&rng, T const &value)#

Assigns the given value to the first count elements in the range beginning at first if count > 0. Does nothing otherwise.

The assignments in the parallel fill_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel fill_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • T – The type of the value to be assigned (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The fill_n algorithm returns an hpx::future<hpx::traits::range_iterator_t<Rng>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns hpx::traits::range_iterator_t<Rng> otherwise, as declared in the signature above.

template<typename ExPolicy, typename FwdIter, typename Size, typename T = typename std::iterator_traits<FwdIter>::value_type>
hpx::parallel::util::detail::algorithm_result<ExPolicy, FwdIter>::type fill_n(ExPolicy &&policy, FwdIter first, Size count, T const &value)#

Assigns the given value to the first count elements in the range beginning at first if count > 0. Does nothing otherwise.

The assignments in the parallel fill_n algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The assignments in the parallel fill_n algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: Performs exactly count assignments, for count > 0.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • FwdIter – The type of the source iterators used for the range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to assign value to.

  • T – The type of the value to be assigned (deduced).

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The fill_n algorithm returns an hpx::future<FwdIter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns FwdIter otherwise, as declared in the signature above.

template<typename Rng, typename T = typename std::iterator_traits<hpx::traits::range_iterator_t<Rng>>::value_type>
hpx::traits::range_traits<Rng>::iterator_type fill_n(Rng &&rng, T const &value)#

Assigns the given value to the first count elements in the range beginning at first if count > 0. Does nothing otherwise.

Template Parameters
  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • T – The type of the value to be assigned (deduced).

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The fill_n algorithm returns an output iterator that compares equal to last.

template<typename FwdIter, typename Size, typename T = typename std::iterator_traits<FwdIter>::value_type>
FwdIter fill_n(FwdIter first, Size count, T const &value)#

Assigns the given value to the first count elements in the range beginning at first if count > 0. Does nothing otherwise.

Note

Complexity: Performs exactly count assignments, for count > 0.

Template Parameters
  • FwdIter – The type of the source iterators used for the range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Size – The type of the argument specifying the number of elements to assign value to.

  • T – The type of the value to be assigned (deduced).

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • count – Refers to the number of elements starting at first the algorithm will be applied to.

  • value – The value to be assigned.

Returns

The fill_n algorithm returns an output iterator that compares equal to last.
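As a small illustration of the count-based behavior described above, the sketch below uses std::fill_n from the standard library, on the assumption that the sequential HPX overload behaves the same way for this input. The wrapper name fill_first_n_sketch is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Fills the first `count` elements of `v` with `value` and returns the index
// one past the last element assigned (0 if count is not positive).
inline std::size_t fill_first_n_sketch(
    std::vector<int>& v, int count, int value)
{
    // The destination must hold at least `count` elements.
    assert(count <= static_cast<int>(v.size()));
    auto it = std::fill_n(v.begin(), count, value);
    return static_cast<std::size_t>(it - v.begin());
}
```

Calling fill_first_n_sketch on 1, 2, 3, 4, 5 with count 3 and value 7 yields 7, 7, 7, 4, 5 and returns 3. A count of 0 leaves the vector untouched and returns 0.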

hpx::ranges::find, hpx::ranges::find_if, hpx::ranges::find_if_not, hpx::ranges::find_end, hpx::ranges::find_first_of#

Defined in header hpx/algorithm.hpp.

See Public API for a list of names and headers that are part of the public HPX API.

namespace hpx
namespace ranges

Functions

template<typename ExPolicy, typename Iter, typename Sent, typename Proj = hpx::identity, typename T = typename hpx::parallel::traits::projected<Iter, Proj>::value_type>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, Iter> find(ExPolicy &&policy, Iter first, Sent last, T const &val, Proj &&proj = Proj())#

Returns the first element in the range [first, last) that is equal to val.

The comparison operations in the parallel find algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel find algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of the operator==().

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Iter – The type of the begin source iterators used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent – The type of the end source iterators used (deduced). This type must meet the requirements of a sentinel for Iter.

  • T – The type of the value to find (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements of the first range the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements of the first range the algorithm will be applied to.

  • val – The value to compare the elements to.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The find algorithm returns an hpx::future<Iter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns Iter otherwise. The find algorithm returns an iterator to the first element in the range [first, last) that is equal to val. If no element in [first, last) is equal to val, the algorithm returns last.
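The effect of the proj parameter is easiest to see with a pointer to a data member: the projection is applied to each element first, and the projected value is then compared with val. The sketch below is a sequential reference model built on the standard library only. The names find_sketch and employee_sketch are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical element type for the usage example.
struct employee_sketch
{
    std::string name;
    int id;
};

// Hypothetical sequential reference: returns the first position whose
// projected element equals val, or the end position if there is none.
template <typename Iter, typename Sent, typename T, typename Proj>
Iter find_sketch(Iter first, Sent last, T const& val, Proj proj)
{
    for (; first != last; ++first)
    {
        // std::invoke also accepts pointers to data members as projections.
        if (std::invoke(proj, *first) == val)
        {
            return first;
        }
    }
    return first;    // equals last when nothing matched
}
```

Searching a list of employees for id 2 with &employee_sketch::id as the projection returns an iterator to the matching record. Searching for an id that is not present returns the end iterator.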

template<typename ExPolicy, typename Rng, typename Proj = hpx::identity, typename T = typename hpx::parallel::traits::projected<hpx::traits::range_iterator_t<Rng>, Proj>::value_type>
hpx::parallel::util::detail::algorithm_result<ExPolicy, hpx::traits::range_iterator_t<Rng>> find(ExPolicy &&policy, Rng &&rng, T const &val, Proj &&proj = Proj())#

Returns the first element in the range [first, last) that is equal to val.

The comparison operations in the parallel find algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel find algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of operator==(), where [first, last) denotes the range rng.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • T – The type of the value to find (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • val – The value to compare the elements to.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The find algorithm returns a hpx::future<hpx::traits::range_iterator_t<Rng>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns hpx::traits::range_iterator_t<Rng> otherwise. The find algorithm returns an iterator to the first element in the range rng that is equal to val. If no element in rng is equal to val, the algorithm returns the end of rng.

template<typename Iter, typename Sent, typename Proj = hpx::identity, typename T = typename hpx::parallel::traits::projected<Iter, Proj>::value_type>
Iter find(Iter first, Sent last, T const &val, Proj &&proj = Proj())#

Returns the first element in the range [first, last) that is equal to val

Note

Complexity: At most last - first applications of operator==().

Template Parameters
  • Iter – The type of the begin source iterator used (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent – The type of the end source iterator used (deduced). This iterator type must meet the requirements of a sentinel for Iter.

  • T – The type of the value to find (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • val – The value to compare the elements to.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The find algorithm returns an iterator to the first element in the range [first, last) that is equal to val. If no element in the range [first, last) is equal to val, the algorithm returns last.
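As an illustration of how a sentinel and a projection enter the search, here is a sequential model of the behavior described above. It is a sketch under stated assumptions rather than HPX's implementation: identity_model (standing in for hpx::identity), null_sentinel, find_model, and index_of are names invented for this example, and the projection is assumed to be a plain callable object.

```cpp
#include <cstddef>

// Hypothetical stand-in for hpx::identity: returns its argument unchanged.
struct identity_model
{
    template <typename U>
    U&& operator()(U&& u) const
    {
        return static_cast<U&&>(u);
    }
};

// Hypothetical sentinel type: marks the end of a null-terminated string.
struct null_sentinel
{
};

// An iterator has not reached the sentinel while it points at a non-null char.
inline bool operator!=(char const* p, null_sentinel)
{
    return *p != '\0';
}

// Sequential model of the documented behavior (not HPX's implementation).
template <typename Iter, typename Sent, typename T,
    typename Proj = identity_model>
Iter find_model(Iter first, Sent last, T const& val, Proj proj = Proj())
{
    for (; first != last; ++first)
    {
        if (proj(*first) == val)    // one operator==() application per element
            return first;           // stop at the first match
    }
    return first;    // no match: first now denotes the same position as last
}

// Index of the first occurrence of c in s, or the string length if absent.
inline std::size_t index_of(char const* s, char c)
{
    return static_cast<std::size_t>(find_model(s, null_sentinel{}, c) - s);
}
```

Under these assumptions, index_of("hello", 'l') evaluates to 2, and index_of("hello", 'z') evaluates to 5, the position at which the sentinel compares equal. The loop also makes the complexity bound visible: each element is compared at most once.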

template<typename Rng, typename Proj = hpx::identity, typename T = typename hpx::parallel::traits::projected<hpx::traits::range_iterator_t<Rng>, Proj>::value_type>
hpx::traits::range_iterator_t<Rng> find(Rng &&rng, T const &val, Proj &&proj = Proj())#

Returns the first element in the range rng that is equal to val

This overload takes no execution policy. The comparison operations execute in sequential order in the calling thread.

Note

Complexity: At most last - first applications of operator==(), where [first, last) denotes the range rng.

Template Parameters
  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • T – The type of the value to find (deduced).

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • val – The value to compare the elements to.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The find algorithm returns an iterator to the first element in the range rng that is equal to val. If no element in rng is equal to val, the algorithm returns the end of rng.
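The role of proj in a range-based search can be sketched as follows. This is a sequential model written against the standard library only, not HPX's implementation; the employee type and the functions find_in_range and name_for_id are hypothetical names introduced for the illustration.

```cpp
#include <iterator>
#include <string>
#include <vector>

// Hypothetical record type used only for this illustration.
struct employee
{
    int id;
    std::string name;
};

// Sequential model of the range overload: proj is applied to each element
// and the projected value is compared with val.
template <typename Rng, typename T, typename Proj>
auto find_in_range(Rng&& rng, T const& val, Proj proj)
    -> decltype(std::begin(rng))
{
    auto first = std::begin(rng);
    auto last = std::end(rng);
    for (; first != last; ++first)
    {
        if (proj(*first) == val)
            return first;
    }
    return last;    // no element matched
}

// Looks up an employee by id; returns the name, or an empty string if absent.
inline std::string name_for_id(std::vector<employee> const& staff, int id)
{
    auto it = find_in_range(staff, id, [](employee const& e) { return e.id; });
    return it != staff.end() ? it->name : std::string();
}
```

The projection lets the caller search for an int in a range of employee objects without defining operator== between the two types. The returned iterator still refers to the whole element, so it->name is available after the lookup.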

template<typename ExPolicy, typename Iter, typename Sent, typename Pred, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result_t<ExPolicy, Iter> find_if(ExPolicy &&policy, Iter first, Sent last, Pred &&pred, Proj &&proj = Proj())#

Returns the first element in the range [first, last) for which predicate pred returns true

The comparison operations in the parallel find_if algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel find_if algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of the predicate.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Iter – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent – The type of the end source iterator used (deduced). This iterator type must meet the requirements of a sentinel for Iter.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of find_if requires Pred to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – The unary predicate which returns true for the required element. The signature of the predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type Iter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The find_if algorithm returns a hpx::future<Iter> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns Iter otherwise. The find_if algorithm returns an iterator to the first element in the range [first, last) that satisfies the predicate pred. If no element satisfies pred, the algorithm returns last.
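Although the predicate may be applied in an unordered fashion, the result is still the first matching element. One way to see how the two fit together is the sketch below, which searches fixed-size chunks in reverse order (standing in for an arbitrary schedule) and keeps the smallest matching index. It is an illustration only and makes no claim about how HPX partitions or schedules the work; chunked_find_if, first_even, and the default chunk size of 4 are assumptions made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Searches v chunk by chunk, visiting the chunks in reverse order, and
// returns the smallest index whose element satisfies pred.
// Returns v.size() if no element satisfies pred. chunk must be non-zero.
template <typename T, typename Pred>
std::size_t chunked_find_if(
    std::vector<T> const& v, Pred pred, std::size_t chunk = 4)
{
    std::size_t best = v.size();    // plays the role of "last"
    std::size_t const chunks = (v.size() + chunk - 1) / chunk;
    for (std::size_t c = chunks; c-- > 0;)
    {
        std::size_t const begin = c * chunk;
        std::size_t const end = std::min(begin + chunk, v.size());
        // Positions at or beyond the best match so far cannot improve it.
        for (std::size_t i = begin; i < end && i < best; ++i)
        {
            if (pred(v[i]))
            {
                best = i;    // an earlier chunk may still lower this
                break;
            }
        }
    }
    return best;
}

// Index of the first even number in v, or v.size() if there is none.
inline std::size_t first_even(std::vector<int> const& v)
{
    return chunked_find_if(v, [](int x) { return x % 2 == 0; });
}
```

For the input {1, 8, 3, 10, 5, 12, 7, 14}, the second chunk is visited first and reports index 5; the first chunk then lowers the result to 1, the position a sequential search would have returned. The total number of predicate applications stays within the last - first bound because no position is tested twice.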

template<typename ExPolicy, typename Rng, typename Pred, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result<ExPolicy, hpx::traits::range_iterator_t<Rng>> find_if(ExPolicy &&policy, Rng &&rng, Pred &&pred, Proj &&proj = Proj())#

Returns the first element in the range rng for which predicate pred returns true

The comparison operations in the parallel find_if algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel find_if algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of the predicate, where [first, last) denotes the range rng.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in which it executes the assignments.

  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of find_if requires Pred to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • policy – The execution policy to use for the scheduling of the iterations.

  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • pred – The unary predicate which returns true for the required element. The signature of the predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of the iterator type of rng can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The find_if algorithm returns a hpx::future<hpx::traits::range_iterator_t<Rng>> if the execution policy is of type sequenced_task_policy or parallel_task_policy and returns hpx::traits::range_iterator_t<Rng> otherwise. The find_if algorithm returns an iterator to the first element in the range rng that satisfies the predicate pred. If no element satisfies pred, the algorithm returns the end of rng.

template<typename Iter, typename Sent, typename Pred, typename Proj = hpx::identity>
Iter find_if(Iter first, Sent last, Pred &&pred, Proj &&proj = Proj())#

Returns the first element in the range [first, last) for which predicate pred returns true

Note

Complexity: At most last - first applications of the predicate.

Template Parameters
  • Iter – The type of the source iterators used for the first range (deduced). This iterator type must meet the requirements of a forward iterator.

  • Sent – The type of the end source iterator used (deduced). This iterator type must meet the requirements of a sentinel for Iter.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of find_if requires Pred to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • first – Refers to the beginning of the sequence of elements the algorithm will be applied to.

  • last – Refers to the end of the sequence of elements the algorithm will be applied to.

  • pred – The unary predicate which returns true for the required element. The signature of the predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of type Iter can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The find_if algorithm returns an iterator to the first element in the range [first, last) that satisfies the predicate pred. If no element satisfies pred, the algorithm returns last.
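The order in which proj and pred are applied, and the early exit behind the complexity bound, can be made concrete with a small sequential model. This is a sketch, not HPX's implementation: find_if_model and predicate_calls are hypothetical names, and both callables are assumed to be plain function objects.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sequential model of find_if with a projection.
template <typename Iter, typename Sent, typename Pred, typename Proj>
Iter find_if_model(Iter first, Sent last, Pred pred, Proj proj)
{
    for (; first != last; ++first)
    {
        if (pred(proj(*first)))    // projection first, then predicate
            return first;          // stop at the first match
    }
    return first;    // no match: first now denotes the same position as last
}

// Returns how many times the predicate ran while searching for the first
// word longer than min_len characters.
inline std::size_t predicate_calls(
    std::vector<std::string> const& words, std::size_t min_len)
{
    std::size_t calls = 0;
    find_if_model(
        words.begin(), words.end(),
        // The predicate sees the projected value (a length), not the string.
        [&](std::size_t len) {
            ++calls;
            return len > min_len;
        },
        [](std::string const& w) { return w.size(); });
    return calls;
}
```

With the words {"a", "bb", "cccc", "dd"} and min_len equal to 3, the predicate runs three times and the search stops at "cccc"; the fourth word is never inspected. When nothing matches, the predicate runs exactly once per element.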

template<typename Rng, typename Pred, typename Proj = hpx::identity>
hpx::traits::range_iterator_t<Rng> find_if(Rng &&rng, Pred &&pred, Proj &&proj = Proj())#

Returns the first element in the range rng for which predicate pred returns true

Note

Complexity: At most last - first applications of the predicate, where [first, last) denotes the range rng.

Template Parameters
  • Rng – The type of the source range used (deduced). The iterators extracted from this range type must meet the requirements of an input iterator.

  • Pred – The type of the function/function object to use (deduced). Unlike its sequential form, the parallel overload of find_if requires Pred to meet the requirements of CopyConstructible.

  • Proj – The type of an optional projection function. This defaults to hpx::identity.

Parameters
  • rng – Refers to the sequence of elements the algorithm will be applied to.

  • pred – The unary predicate which returns true for the required element. The signature of the predicate should be equivalent to:

    bool pred(const Type &a);
    
    The signature does not need to have const &, but the function must not modify the objects passed to it. The type Type must be such that objects of the iterator type of rng can be dereferenced and then implicitly converted to Type.

  • proj – Specifies the function (or function object) which will be invoked for each of the elements as a projection operation before the actual predicate is invoked.

Returns

The find_if algorithm returns an iterator to the first element in the range rng that satisfies the predicate pred. If no element satisfies pred, the algorithm returns the end of rng.

template<typename ExPolicy, typename Iter, typename Sent, typename Pred, typename Proj = hpx::identity>
hpx::parallel::util::detail::algorithm_result<ExPolicy, Iter>::type find_if_not(ExPolicy &&policy, Iter first, Sent last, Pred &&pred, Proj &&proj = Proj())#

Returns the first element in the range [first, last) for which predicate pred returns false

The comparison operations in the parallel find_if_not algorithm invoked with an execution policy object of type sequenced_policy execute in sequential order in the calling thread.

The comparison operations in the parallel find_if_not algorithm invoked with an execution policy object of type parallel_policy or parallel_task_policy are permitted to execute in an unordered fashion in unspecified threads, and indeterminately sequenced within each thread.

Note

Complexity: At most last - first applications of the predicate.

Template Parameters
  • ExPolicy – The type of the execution policy to use (deduced). It describes the manner in which the execution of the algorithm may be parallelized and the manner in whic