HPX V1.9.0 (May 2, 2023)

General changes

  • Added RISC-V 64-bit support. HPX can now be built for 64-bit RISC-V architectures.

  • The LCI parcelport has been optimized to transfer parcels with fewer messages and to use the HPX resource partitioner to allocate its progress thread. It should generally perform better than before, and it no longer depends on the MPI library.

  • HPX's dependency on Boost was further relaxed by replacing headers from Boost.Range, Boost.Tokenizer and Boost.Lockfree.

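  • Our parallel algorithms implementation was improved, including unseq adaptations (PR #6017, PR #6018), data-parallel support for the replace family of algorithms (PR #6033), and support for pure sender/receiver based executors (PR #5948).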

  • Our Senders/Receivers (P2300) integration was extended:

    • Coroutines were integrated with senders/receivers.

    • get_completion_signatures now works with awaitable senders.

    • with_awaitable_senders allows the passed senders to retrieve the value, i.e. senders are transparently awaitable from within a coroutine.

    • when_all_vector was added.

  • The sender consumers sync_wait and sync_wait_with_variant were added. Users can now start the execution of an asynchronous pipeline and block the calling thread (for example, the thread that executes the main() function) until the result is available.

  • The combinators for futures (a.k.a. async_combinators) when_*, wait_*, wait_*_nothrow were turned into CPOs allowing for end-user customization. For more information on the async_combinators refer to the documentation, https://hpx-docs.stellar-group.org/latest/html/libs/core/async_combinators/docs/index.html?highlight=combinators.

  • The new datapar backend SVE allows the simd and par_simd execution policies to exploit data parallelism on processors that have SVE vector registers, such as A64FX and Neoverse V1.

  • The documentation for parallel algorithms and container algorithms was further improved. The Public API page was vastly enriched.

  • A copy button was added at the top-right of code blocks in the documentation.

  • The pragma directive that reports warnings as errors on MSVC was fixed.

  • The command line argument --hpx:loopback_network was added to facilitate debugging with networks.

  • We added an HPX-SYCL integration, allowing users to obtain HPX futures for SYCL events. This effectively enables the integration of arbitrary asynchronous SYCL operations into the HPX task graph. Bolted on top of this integration, we further added an HPX-SYCL executor for ease of use.
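The conversion of the async combinators into CPOs (customization point objects) mentioned above can be illustrated with a minimal, self-contained C++17 sketch. It does not use HPX and is not HPX's implementation. The names total, total_t, is_tag_invocable, two_ints and run_sketch are hypothetical and chosen only for illustration. The sketch shows the general tag_invoke-based pattern, assuming that is the mechanism of interest: a function object applies a default behaviour unless the user supplies a tag_invoke overload that is found through argument-dependent lookup.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {
    namespace detail {
        // True if an ADL-visible tag_invoke(Tag, T) overload exists.
        template <typename Tag, typename T, typename = void>
        struct is_tag_invocable : std::false_type
        {
        };

        template <typename Tag, typename T>
        struct is_tag_invocable<Tag, T,
            std::void_t<decltype(
                tag_invoke(std::declval<Tag>(), std::declval<T>()))>>
          : std::true_type
        {
        };
    }    // namespace detail

    // Hypothetical customization point object (not an HPX name).
    struct total_t
    {
        template <typename T>
        auto operator()(T const& t) const
        {
            if constexpr (detail::is_tag_invocable<total_t, T const&>::value)
            {
                // A user-provided overload takes precedence.
                return tag_invoke(*this, t);
            }
            else
            {
                // Default behaviour: sum the elements of a container.
                typename T::value_type sum{};
                for (auto const& v : t)
                    sum += v;
                return sum;
            }
        }
    };

    inline constexpr total_t total{};
}    // namespace sketch

namespace user {
    // A type the default cannot handle (no value_type, not iterable).
    struct two_ints
    {
        int a;
        int b;
    };

    // End-user customization, found through argument-dependent lookup.
    inline int tag_invoke(sketch::total_t, two_ints const& p)
    {
        return p.a + p.b;
    }
}    // namespace user

// Exercises both the default and the customized path.
inline void run_sketch()
{
    assert(sketch::total(std::vector<int>{1, 2, 3}) == 6);
    assert(sketch::total(user::two_ints{40, 2}) == 42);
}
```

Calling run_sketch() from main() exercises both paths. The design point is that callers always write sketch::total(x), while the author of a type decides, in the type's own namespace, whether to override the default.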

Breaking changes

  • Support for Clang V8 and V9 was dropped; the minimum supported version is now Clang V10.

  • Support for gcc V8 was dropped; the minimum supported version is now gcc V9.

  • Support for Visual Studio 2015 was dropped; the minimum supported version is now Visual Studio 2019.

  • The deprecated tag_policy_tag et al. types were re-added after having been removed in HPX V1.8.1.

  • The get_chunk_size and processing_units_count customization points now expect the time for one iteration as an argument.

  • The list of all the namespace changes can be found here: HPX V1.9.0 Namespace changes.

Closed issues

  • Issue #6203 - Compilation error with -mcpu=a64fx on Ookami

  • Issue #6196 - Incorrect log destination

  • Issue #6191 - installing HPX

  • Issue #6184 - Wrong processing_units_count of restricted_thread_pool_executor

  • Issue #6171 - Release Tag Name Request

  • Issue #6162 - Current master does not compile on ROSTAM

  • Issue #6156 - hpxcxx does not work if HPX_WITH_PKGCONFIG=OFF

  • Issue #6108 - cxx17_aligned_new.cpp on msvc fails due to wrong pragma directive

  • Issue #6045 - Can’t call nullary callables wrapped with hpx::unwrapping

  • Issue #6013 - Unable to build subprojects hpx_collectives/hpx_compute with MSVC

  • Issue #6008 - Missing constexpr default constructor for hpx::mutex

  • Issue #5999 - Add HPX Conda package to conda-forge

  • Issue #5998 - Serializing multiple arguments when applying distributed action results in segfault

  • Issue #5958 - HPX 1.8.0 and Blaze issues

  • Issue #5908 - Windows: duplicated symbols in static builds

  • Issue #5802 - Lost status is_ready from future

  • Issue #5767 - Performance drop on Piz Daint

  • Issue #5752 - Implement stride_view from P1899 (experimental)

  • Issue #5744 - HPX_WITH_FETCH_ASIO not working on Ookami

  • Issue #5561 - Possible race condition in helper thread / hpx::cout

Closed pull requests

  • PR #6228 - Fixing algorithms for zero length sequences when run with s/r scheduler

  • PR #6227 - Reliably disable background work when no networking is enabled

  • PR #6225 - Make heap fails in par for small sized heaps #6217

  • PR #6222 - Add documentation for hpx::post

  • PR #6221 - Fix segmented algorithms tests

  • PR #6218 - Creating INSTALL component ‘runtime’ to enable installing binaries only

  • PR #6216 - added tests for set_difference, updated set_operation.hpp to fix #6198

  • PR #6213 - Modernize and streamline MPI parcelport

  • PR #6211 - Modernize modules of level 11, 12, and 13

  • PR #6210 - Fixing MPI parcelport initialization if MPI is initialized outside of HPX

  • PR #6209 - Prevent thread stealing during scheduler shutdown

  • PR #6208 - Fix the compilation warning in the MPI parcelport with gcc 11.2

  • PR #6207 - Automatically enable Boost.Context when compiling for arm64.

  • PR #6206 - Update CMakeLists.txt

  • PR #6205 - Do not generate hpxcxx if support for pkgconfig was disabled

  • PR #6204 - Use LRT_ instead of LAPP_ logging in barrier implementation

  • PR #6202 - Fixing Fedora build errors on Power systems

  • PR #6201 - Update the LCI parcelport documents

  • PR #6200 - Par link jobs

  • PR #6197 - LCI parcelport: add doc, upgrade to v1.7.4, refactor cmake autofetch.

  • PR #6195 - Change the default tag of autofetch LCI to v1.7.3.

  • PR #6192 - Fix page Writing single-node applications

  • PR #6189 - Making sure restricted_thread_pool_executor properly reports used number of cores

  • PR #6187 - Enable using for_loop with range generators

  • PR #6186 - thread_support/CMakeLists: Fix build issue

  • PR #6185 - Fix EVE datapar with cxx_standard less than 20

  • PR #6183 - Update CI integration for EVE

  • PR #6182 - Fixing performance regressions

  • PR #6181 - LCI parcelport: backlog queue, aggregation, separate devices, and more

  • PR #6180 - Fixing use of for_loop with rebound execution policy (using .with())

  • PR #6179 - Taking predicates for algorithms by value

  • PR #6178 - Changes needed to make chapel_hpx examples work

  • PR #6176 - Fixing warnings that were generated by PVS Studio

  • PR #6174 - Replace boost::integer::gcd with std::gcd

  • PR #6172 - [Docs] Fix example of how to run single/specific test(s)

  • PR #6170 - Adding missing fallback for processing_units_count customization point

  • PR #6169 - LCI parcelport: bypass the parcel queue and connection cache.

  • PR #6167 - Add create_local_communicator API function

  • PR #6166 - Add missing header for std::intmax_t

  • PR #6165 - Attempt to work around MSVC problem

  • PR #6161 - Update EVE integration

  • PR #6160 - More cleanup for module levels 0 to 10

  • PR #6159 - Fix minor spelling mistake in generate_issue_pr_list.sh

  • PR #6158 - Update documentation in writing single-node applications page

  • PR #6157 - Improve index_queue_spawning

  • PR #6154 - Avoid performing late command line handling twice in distributed runtime

  • PR #6152 - The -rd and -mr options didn’t work, and they should have been --rd and --mr

  • PR #6151 - Refactoring the Manual page in documentation

  • PR #6148 - Investigate the failure of the LCI parcelport.

  • PR #6147 - Make posix co-routine stacks non-executable

  • PR #6146 - Avoid ambiguities wrt tag_invoke

  • PR #6144 - General improvements to scheduling and related fixes

  • PR #6143 - Add list of new namespaces for new release

  • PR #6140 - Fixing background scheduler to properly exit in the end

  • PR #6139 - [P2300] execution: Cleanup coroutines integration and improve ADL isolation

  • PR #6137 - Adding example of a simple master/slave distributed application

  • PR #6136 - Deprecate execution::experimental::task_group in favor of experimental::task_group

  • PR #6135 - Fixing warnings reported by MSVC analysis

  • PR #6134 - Adding notification function for parcelports to be called after early parcel handling

  • PR #6132 - Fixing to_non_par() for parallel simd policies

  • PR #6131 - modernize modules from level 25

  • PR #6130 - Remove the mutex lock in the critical path of get_partitioner.

  • PR #6129 - Modernize module from levels 22, 23

  • PR #6127 - Working around gccV9 problem that prevent us from storing enum classes in bit fields

  • PR #6126 - Deprecate hpx::parallel::task_block in favor of hpx::experimental::ta…

  • PR #6125 - Making sure sync_wait compiles when used with an lvalue sender involving bulk

  • PR #6124 - Fixing use of any_sender in combination with when_all

  • PR #6123 - Fixed issues found by PVS-Studio

  • PR #6121 - Modernize modules of level 21, 22

  • PR #6120 - Use index_queue for parallel executors bulk_async_execute

  • PR #6119 - Update CMakeLists.txt

  • PR #6118 - Modernize modules from level 17, 18, 19, and 20

  • PR #6117 - Initialize buffer_allocate_time_ to 0

  • PR #6116 - Add new command line argument --hpx:loopback_network

  • PR #6115 - Modernize modules of levels 14, 15, and 16

  • PR #6114 - Enhance the formatting of the documentation

  • PR #6113 - Modernize modules in module level 11, 12, and 13

  • PR #6112 - Modernize modules from levels 9 and 10

  • PR #6111 - Modernize all modules from module level 8

  • PR #6110 - Use pragma error directive to report warnings as errors on msvc

  • PR #6109 - Modernize serialization module

  • PR #6107 - Modernize error module

  • PR #6106 - Modernizing modules of levels 0 to 5

  • PR #6105 - Optimizations on LCI parcelport: merge small messages; remove sender mutex lock.

  • PR #6104 - Adding parameters API: measure_iteration

  • PR #6103 - Document task_group and include in Public API

  • PR #6102 - Prevent warnings generated by clang-cl

  • PR #6101 - Using more fold expressions

  • PR #6100 - Deprecate hpx::parallel::reduce_by_key in favor of hpx::experimental::reduce_by_key

  • PR #6098 - Forking Boost.Lockfree

  • PR #6096 - Forking Boost.Tokenizer

  • PR #6095 - Replacing facilities from Boost.Range

  • PR #6094 - Removing object_semaphore

  • PR #6093 - Replace boost::string_ref with std::string_view

  • PR #6092 - Use C++17 static_assert where possible

  • PR #6091 - Replace artificial sequencing with fold expressions

  • PR #6090 - Fixing use of get_chunk_size customization point

  • PR #6088 - Add/fix Public API documentation

  • PR #6086 - Deprecate hpx::util::unlock_guard in favor of hpx::unlock_guard

  • PR #6085 - Add experimental sycl integration/executor

  • PR #6084 - Renaming hpx::apply and friends to hpx::post

  • PR #6083 - Using if constexpr instead of tag-dispatching, where possible

  • PR #6082 - Replace util::always_void_t with std::void_t

  • PR #6081 - Update github actions to avoid warnings

  • PR #6080 - Disable some tests that fail on LCI

  • PR #6079 - Adding more natvis files, correct existing

  • PR #6078 - Changing target name of memory_counters component

  • PR #6077 - Making default constructor of hpx::mutex constexpr

  • PR #6076 - Cleaning up functionality that was deprecated in V1.7

  • PR #6075 - Remove conditional code for gcc V7 and below

  • PR #6074 - Fixing compilation issues on gcc V8

  • PR #6073 - Fixing PAPI counter component compilation

  • PR #6072 - Adding ex::when_all_vector

  • PR #6071 - Making get_forward_progress_guarantee_t specializations constexpr

  • PR #6070 - Implement P2690 for our algorithms

  • PR #6069 - Do not check for cancellation during each iteration but only once per partition

  • PR #6068 - Prevent using task and non_task as a CPO

  • PR #6067 - Deprecated hpx::util::mem_fn in favor of hpx::mem_fn

  • PR #6066 - Create codeql.yml

  • PR #6064 - Adapting adjacent_difference for S/R execution

  • PR #6063 - Modernize iterator_support module

  • PR #6062 - Make sure wrapping executor does not go out of scope prematurely

  • PR #6061 - Minor fix in small_vector (from upstream)

  • PR #6060 - Allow to disable registering signal handlers

  • PR #6059 - [P2300] Fix: declval cannot be ODR used

  • PR #6058 - Avoid ambiguity for hpx::get used with std::variant

  • PR #6057 - Create a dedicated thread pool to run LCI_progress.

  • PR #6056 - Fix coroutine test for clang

  • PR #6055 - Patches needed to be able to build HPX 1.8.1 on various platforms

  • PR #6054 - Use MSVC specific attribute [[msvc::no_unique_address]]

  • PR #6052 - Deprecated hpx::util::invoke_fused in favor of hpx::invoke_fused

  • PR #6051 - Add non-contiguous index queue and use it in thread_pool_bulk_scheduler

  • PR #6049 - Crosscompile arm sve

  • PR #6048 - Deprecated hpx::util::invoke in favor of hpx::invoke

  • PR #6047 - Separating binary_semaphore into its own file

  • PR #6046 - Support using unwrapping with nullary function objects

  • PR #6044 - Generalize the use of then() and dataflow

  • PR #6043 - Clean up scan_partitioner

  • PR #6042 - Modernize dataflow API

  • PR #6041 - docs: document semaphores

  • PR #6040 - Add/Fix documentation of Public API page

  • PR #6039 - remove MPI dependency when only using LCI parcelport

  • PR #6038 - Clean up command line handling

  • PR #6037 - Avoid performing parcel related background work if networking is disabled

  • PR #6036 - Support new datapar backend : SVE

  • PR #6035 - Simplify datapar replace copy if

  • PR #6034 - Add/Fix documentation of Public API

  • PR #6033 - Support for data-parallelism for replace, replace_if, replace_copy, replace_copy_if algorithms

  • PR #6032 - Add documentation in public API

  • PR #6031 - Expose available cache sizes from topology object

  • PR #6030 - Adding parcelport initialization hook for resource partitioner operation

  • PR #6029 - Simplify startup code

  • PR #6027 - Add/Fix documentation in Public API page

  • PR #6026 - add option hpx:force_ipv4 to force resolving hostnames to ipv4 addresses

  • PR #6025 - build(docs): remove leftover sections

  • PR #6023 - Minor fixes on “How to build on Windows”

  • PR #6022 - build(doxy): don’t extract private members

  • PR #6021 - Adding pu_mask to thread_pool_bulk_scheduler

  • PR #6020 - docs: add cppref NamedRequirements support

  • PR #6018 - Unseq adaptation for for_each, transform, reduce, transform_reduce, etc.

  • PR #6017 - loop and transform_loop unseq adaptation

  • PR #6016 - Config and structural updates to support unseq implementation

  • PR #6015 - Integrating sync_wait & sync_wait_with_variant

  • PR #6012 - docs: add missing links to public api

  • PR #6009 - Fixing sender&receiver integration with for_each and for_loop

  • PR #6007 - docs: add docs for mutex.hpp

  • PR #6006 - Relax future::is_ready where possible

  • PR #6005 - reshuffle header tests to different instances

  • PR #6004 - Add documentation Public API

  • PR #6003 - Always exporting get_component_name implementations

  • PR #6002 - Making sure that default constructible arguments are properly constructed during deserialization

  • PR #5996 - Add back explicit template parameters to lock_guards for nvcc

  • PR #5994 - Fix CTRL+C on windows

  • PR #5993 - Using EVE requires C++20

  • PR #5992 - This properly terminates an application on Ctrl-C on Windows

  • PR #5991 - Support IPV6 on command line for explicit network initialization

  • PR #5990 - P2300 enhancements

  • PR #5989 - Fix missing documentation in Public API page

  • PR #5987 - Attempting to fix timed executor API

  • PR #5986 - Fix warnings when building docs

  • PR #5985 - Re-add deprecated tag_policy_tag et.al. types that were removed in V1.8.1

  • PR #5981 - docs: add docs for condition_variable.hpp

  • PR #5980 - More work on execution::read

  • PR #5979 - Unsupport clang-v8 and clang-v9, switch LSU clang-v13 to C++17

  • PR #5977 - fix: Compilation errors for -std=c++17 builders

  • PR #5975 - docs: fix & improve parallel algorithms documentation 5

  • PR #5974 - [P2300] Adapt get completion signatures for awaitable senders

  • PR #5973 - defaults boost.context on riscv64

  • PR #5972 - Fix documentation for container algorithms

  • PR #5971 - added logic to detect riscv compiler configured for 64 bit target

  • PR #5968 - adds risc-v 64 bit support

  • PR #5967 - Adding missing pieces to sync_wait, adding run_loop

  • PR #5966 - docs: fix & improve parallel algorithms documentation 4

  • PR #5965 - Fixing inspect problems, adding missing header file

  • PR #5962 - Changes in html page of documentation

  • PR #5961 - Prevent stalling during shutdown when running hello_world_distributed

  • PR #5955 - Fix documentation for container algorithms

  • PR #5952 - docs: fix & improve parallel algorithms documentation 3

  • PR #5950 - Change executors to directly implement the executor CPOs

  • PR #5949 - Converting async combinators into CPOs

  • PR #5948 - Adding support for pure sender/receiver based executors to parallel algorithms

  • PR #5945 - [P2300] Added fundamental coroutine_traits for S/R

  • PR #5883 - Optimization on LCI parcelport: uses LCI_putva

  • PR #5872 - Block fork join executor

  • PR #5855 - Adding performance test Jenkins builder at LSU