HPX V1.9.0 (May 2, 2023)
General changes
Added RISC-V 64-bit support. HPX can now be built for 64-bit RISC-V architectures.
LCI parcelport has been optimized to transfer parcels with fewer messages and use the HPX resource partitioner for its progress thread allocation. It should generally provide better performance than before. It also removes its dependency on the MPI library.
HPX's dependency on Boost was further relaxed by replacing headers from Boost.Range, Boost.Tokenizer, and Boost.Lockfree.
Improvements were made to our parallel algorithms implementation.
Our Senders/Receivers (P2300) integration was extended:
- Coroutines were integrated with senders/receivers.
- get_completion_signatures now works with awaitable senders.
- with_awaitable_senders allows the passed senders to retrieve the value, i.e. senders are transparently awaitable from within a coroutine.
- when_all_vector was added.
- sync_wait and sync_wait_with_variant sender consumers were added. The user can now initiate the execution of their asynchronous pipeline by blocking the current thread that executes the main() function until the result is retrieved.
The combinators for futures (a.k.a. async_combinators) when_*, wait_*, and wait_*_nothrow were turned into CPOs, allowing for end-user customization. For more information on the async_combinators, refer to the documentation: https://hpx-docs.stellar-group.org/latest/html/libs/core/async_combinators/docs/index.html?highlight=combinators
The new datapar backend SVE allows the simd and par_simd execution policies to exploit data parallelism on processors that have SVE vector registers, such as A64FX and Neoverse V1.
The documentation for parallel algorithms and container algorithms was further improved. The Public API page was substantially extended.
A copy button was added at the top-right corner of code blocks.
Pragma directive that reports warnings as errors on MSVC was fixed.
The command line argument --hpx:loopback_network was added to facilitate debugging with networks.
We added an HPX-SYCL integration, allowing users to obtain HPX futures for SYCL events. This effectively enables the integration of arbitrary asynchronous SYCL operations into the HPX task graph. On top of this integration, we further added an HPX-SYCL executor for ease of use.
Breaking changes
Support for Clang V8 was dropped; the minimum supported version is now Clang V10.
Support for gcc V8 was dropped; the minimum supported version is now gcc V9.
Support for Visual Studio 2015 was dropped; the minimum supported version is now Visual Studio 2019.
tag_policy_tag et al. were re-added after their deprecation in HPX V1.8.1.
The get_chunk_size and processing_units_count APIs now expect the time for one iteration as an argument.
The list of all the namespace changes can be found here: HPX V1.9.0 Namespace changes.
Closed issues
Issue #6203 - Compilation error with -mcpu=a64fx on Ookami
Issue #6196 - Incorrect log destination
Issue #6191 - installing HPX
Issue #6184 - Wrong processing_units_count of restricted_thread_pool_executor
Issue #6171 - Release Tag Name Request
Issue #6162 - Current master does not compile on ROSTAM
Issue #6156 - hpxcxx does not work if HPX_WITH_PKGCONFIG=OFF
Issue #6108 - cxx17_aligned_new.cpp on msvc fails due to wrong pragma directive
Issue #6045 - Can’t call nullary callables wrapped with hpx::unwrapping
Issue #6013 - Unable to build subprojects hpx_collectives/hpx_compute with MSVC
Issue #6008 - Missing constexpr default constructor for hpx::mutex
Issue #5999 - Add HPX Conda package to conda-forge
Issue #5998 - Serializing multiple arguments when applying distributed action results in segfault
Issue #5958 - HPX 1.8.0 and Blaze issues
Issue #5908 - Windows: duplicated symbols in static builds
Issue #5802 - Lost status is_ready from future
Issue #5767 - Performance drop on Piz Daint
Issue #5752 - Implement stride_view from P1899 (experimental)
Issue #5744 - HPX_WITH_FETCH_ASIO not working on Ookami
Issue #5561 - Possible race condition in helper thread / hpx::cout
Closed pull requests
PR #6228 - Fixing algorithms for zero length sequences when run with s/r scheduler
PR #6227 - Reliably disable background work when no networking is enabled
PR #6225 - Make heap fails in par for small sized heaps #6217
PR #6222 - Add documentation for hpx::post
PR #6221 - Fix segmented algorithms tests
PR #6218 - Creating INSTALL component ‘runtime’ to enable installing binaries only
PR #6216 - added tests for set_difference, updated set_operation.hpp to fix #6198
PR #6213 - Modernize and streamline MPI parcelport
PR #6211 - Modernize modules of level 11, 12, and 13
PR #6210 - Fixing MPI parcelport initialization if MPI is initialized outside of HPX
PR #6209 - Prevent thread stealing during scheduler shutdown
PR #6208 - Fix the compilation warning in the MPI parcelport with gcc 11.2
PR #6207 - Automatically enable Boost.Context when compiling for arm64.
PR #6206 - Update CMakeLists.txt
PR #6205 - Do not generate hpxcxx if support for pkgconfig was disabled
PR #6204 - Use LRT_ instead of LAPP_ logging in barrier implementation
PR #6202 - Fixing Fedora build errors on Power systems
PR #6201 - Update the LCI parcelport documents
PR #6200 - Par link jobs
PR #6197 - LCI parcelport: add doc, upgrade to v1.7.4, refactor cmake autofetch.
PR #6195 - Change the default tag of autofetch LCI to v1.7.3.
PR #6192 - Fix page Writing single-node applications
PR #6189 - Making sure restricted_thread_pool_executor properly reports used number of cores
PR #6187 - Enable using for_loop with range generators
PR #6186 - thread_support/CMakeLists: Fix build issue
PR #6185 - Fix EVE datapar with cxx_standard less than 20
PR #6183 - Update CI integration for EVE
PR #6182 - Fixing performance regressions
PR #6181 - LCI parcelport: backlog queue, aggregation, separate devices, and more
PR #6180 - Fixing use of for_loop with rebound execution policy (using .with())
PR #6179 - Taking predicates for algorithms by value
PR #6178 - Changes needed to make chapel_hpx examples work
PR #6176 - Fixing warnings that were generated by PVS Studio
PR #6174 - Replace boost::integer::gcd with std::gcd
PR #6172 - [Docs] Fix example of how to run single/specific test(s)
PR #6170 - Adding missing fallback for processing_units_count customization point
PR #6169 - LCI parcelport: bypass the parcel queue and connection cache.
PR #6167 - Add create_local_communicator API function
PR #6166 - Add missing header for std::intmax_t
PR #6165 - Attempt to work around MSVC problem
PR #6161 - Update EVE integration
PR #6160 - More cleanup for module levels 0 to 10
PR #6159 - Fix minor spelling mistake in generate_issue_pr_list.sh
PR #6158 - Update documentation in writing single-node applications page
PR #6157 - Improve index_queue_spawning
PR #6154 - Avoid performing late command line handling twice in distributed runtime
PR #6152 - The -rd and -mr options didn’t work, and they should have been --rd and --mr
PR #6151 - Refactoring the Manual page in documentation
PR #6148 - Investigate the failure of the LCI parcelport.
PR #6147 - Make posix co-routine stacks non-executable
PR #6146 - Avoid ambiguities wrt tag_invoke
PR #6144 - General improvements to scheduling and related fixes
PR #6143 - Add list of new namespaces for new release
PR #6140 - Fixing background scheduler to properly exit in the end
PR #6139 - [P2300] execution: Cleanup coroutines integration and improve ADL isolation
PR #6137 - Adding example of a simple master/slave distributed application
PR #6136 - Deprecate execution::experimental::task_group in favor of experimental::task_group
PR #6135 - Fixing warnings reported by MSVC analysis
PR #6134 - Adding notification function for parcelports to be called after early parcel handling
PR #6132 - Fixing to_non_par() for parallel simd policies
PR #6131 - modernize modules from level 25
PR #6130 - Remove the mutex lock in the critical path of get_partitioner.
PR #6129 - Modernize module from levels 22, 23
PR #6127 - Working around gccV9 problem that prevent us from storing enum classes in bit fields
PR #6126 - Deprecate hpx::parallel::task_block in favor of hpx::experimental::ta…
PR #6125 - Making sure sync_wait compiles when used with an lvalue sender involving bulk
PR #6124 - Fixing use of any_sender in combination with when_all
PR #6123 - Fixed issues found by PVS-Studio
PR #6121 - Modernize modules of level 21, 22
PR #6120 - Use index_queue for parallel executors bulk_async_execute
PR #6119 - Update CMakeLists.txt
PR #6118 - Modernize modules from level 17, 18, 19, and 20
PR #6117 - Initialize buffer_allocate_time_ to 0
PR #6116 - Add new command line argument --hpx:loopback_network
PR #6115 - Modernize modules of levels 14, 15, and 16
PR #6114 - Enhance the formatting of the documentation
PR #6113 - Modernize modules in module level 11, 12, and 13
PR #6112 - Modernize modules from levels 9 and 10
PR #6111 - Modernize all modules from module level 8
PR #6110 - Use pragma error directive to report warnings as errors on msvc
PR #6109 - Modernize serialization module
PR #6107 - Modernize error module
PR #6106 - Modernizing modules of levels 0 to 5
PR #6105 - Optimizations on LCI parcelport: merge small messages; remove sender mutex lock.
PR #6104 - Adding parameters API: measure_iteration
PR #6103 - Document task_group and include in Public API
PR #6102 - Prevent warnings generated by clang-cl
PR #6101 - Using more fold expressions
PR #6100 - Deprecate hpx::parallel::reduce_by_key in favor of hpx::experimental::reduce_by_key
PR #6098 - Forking Boost.Lockfree
PR #6096 - Forking Boost.Tokenizer
PR #6095 - Replacing facilities from Boost.Range
PR #6094 - Removing object_semaphore
PR #6093 - Replace boost::string_ref with std::string_view
PR #6092 - Use C++17 static_assert where possible
PR #6091 - Replace artificial sequencing with fold expressions
PR #6090 - Fixing use of get_chunk_size customization point
PR #6088 - Add/fix Public API documentation
PR #6086 - Deprecate hpx::util::unlock_guard in favor of hpx::unlock_guard
PR #6085 - Add experimental sycl integration/executor
PR #6084 - Renaming hpx::apply and friends to hpx::post
PR #6083 - Using if constexpr instead of tag-dispatching, where possible
PR #6082 - Replace util::always_void_t with std::void_t
PR #6081 - Update github actions to avoid warnings
PR #6080 - Disable some tests that fail on LCI
PR #6079 - Adding more natvis files, correct existing
PR #6078 - Changing target name of memory_counters component
PR #6077 - Making default constructor of hpx::mutex constexpr
PR #6076 - Cleaning up functionality that was deprecated in V1.7
PR #6075 - Remove conditional code for gcc V7 and below
PR #6074 - Fixing compilation issues on gcc V8
PR #6073 - Fixing PAPI counter component compilation
PR #6072 - Adding ex::when_all_vector
PR #6071 - Making get_forward_progress_guarantee_t specializations constexpr
PR #6070 - Implement P2690 for our algorithms
PR #6069 - Do not check for cancellation during each iteration but only once per partition
PR #6068 - Prevent using task and non_task as a CPO
PR #6067 - Deprecated hpx::util::mem_fn in favor of hpx::mem_fn
PR #6066 - Create codeql.yml
PR #6064 - Adapting adjacent_difference for S/R execution
PR #6063 - Modernize iterator_support module
PR #6062 - Make sure wrapping executor does not go out of scope prematurely
PR #6061 - Minor fix in small_vector (from upstream)
PR #6060 - Allow to disable registering signal handlers
PR #6059 - [P2300] Fix: declval cannot be ODR used
PR #6058 - Avoid ambiguity for hpx::get used with std::variant
PR #6057 - Create a dedicated thread pool to run LCI_progress.
PR #6056 - Fix coroutine test for clang
PR #6055 - Patches needed to be able to build HPX 1.8.1 on various platforms
PR #6054 - Use MSVC specific attribute [[msvc::no_unique_address]]
PR #6052 - Deprecated hpx::util::invoke_fused in favor of hpx::invoke_fused
PR #6051 - Add non-contiguous index queue and use it in thread_pool_bulk_scheduler
PR #6049 - Crosscompile arm sve
PR #6048 - Deprecated hpx::util::invoke in favor of hpx::invoke
PR #6047 - Separating binary_semaphore into its own file
PR #6046 - Support using unwrapping with nullary function objects
PR #6044 - Generalize the use of then() and dataflow
PR #6043 - Clean up scan_partitioner
PR #6042 - Modernize dataflow API
PR #6041 - docs: document semaphores
PR #6040 - Add/Fix documentation of Public API page
PR #6039 - remove MPI dependency when only using LCI parcelport
PR #6038 - Clean up command line handling
PR #6037 - Avoid performing parcel related background work if networking is disabled
PR #6036 - Support new datapar backend : SVE
PR #6035 - Simplify datapar replace copy if
PR #6034 - Add/Fix documentation of Public API
PR #6033 - Support for data-parallelism for replace, replace_if, replace_copy, replace_copy_if algorithms
PR #6032 - Add documentation in public API
PR #6031 - Expose available cache sizes from topology object
PR #6030 - Adding parcelport initialization hook for resource partitioner operation
PR #6029 - Simplify startup code
PR #6027 - Add/Fix documentation in Public API page
PR #6026 - add option hpx:force_ipv4 to force resolving hostnames to ipv4 adresses
PR #6025 - build(docs): remove leftover sections
PR #6023 - Minor fixes on “How to build on Windows”
PR #6022 - build(doxy): don’t extract private members
PR #6021 - Adding pu_mask to thread_pool_bulk_scheduler
PR #6020 - docs: add cppref NamedRequirements support
PR #6018 - Unseq adaptation for for_each, transform, reduce, transform_reduce, etc.
PR #6017 - loop and transform_loop unseq adaptation
PR #6016 - Config and structural updates to support unseq implementation
PR #6015 - Integrating sync_wait & sync_wait_with_variant
PR #6012 - docs: add missing links to public api
PR #6009 - Fixing sender&receiver integration with for_each and for_loop
PR #6007 - docs: add docs for mutex.hpp
PR #6006 - Relax future::is_ready where possible
PR #6005 - reshuffle header tests to different instances
PR #6004 - Add documentation Public API
PR #6003 - Always exporting get_component_name implementations
PR #6002 - Making sure that default constructble arguments are properly constructed during deserialization
PR #5996 - Add back explicit template parameters to lock_guards for nvcc
PR #5994 - Fix CTRL+C on windows
PR #5993 - Using EVE requires C++20
PR #5992 - This properly terminates an application on Ctrl-C on Windows
PR #5991 - Support IPV6 on command line for explicit network initialization
PR #5990 - P2300 enhancements
PR #5989 - Fix missing documentation in Public API page
PR #5987 - Attempting to fix timed executor API
PR #5986 - Fix warnings when building docs
PR #5985 - Re-add deprecated tag_policy_tag et.al. types that were removed in V1.8.1
PR #5981 - docs: add docs for condition_variable.hpp
PR #5980 - More work on execution::read
PR #5979 - Unsupport clang-v8 and clang-v9, switch LSU clang-v13 to C++17
PR #5977 - fix: Compilation errors for -std=c++17 builders
PR #5975 - docs: fix & improve parallel algorithms documentation 5
PR #5974 - [P2300] Adapt get completion signatures for awaitable senders
PR #5973 - defaults boost.context on riscv64
PR #5972 - Fix documentation for container algorithms
PR #5971 - added logic to detect riscv compiler configured for 64 bit target
PR #5968 - adds risc-v 64 bit support
PR #5967 - Adding missing pieces to sync_wait, adding run_loop
PR #5966 - docs: fix & improve parallel algorithms documentation 4
PR #5965 - Fixing inspect problems, adding missing header file
PR #5962 - Changes in html page of documentation
PR #5961 - Prevent stalling during shutdown when running hello_world_distributed
PR #5955 - Fix documentation for container algorithms
PR #5952 - docs: fix & improve parallel algorithms documentation 3
PR #5950 - Change executors to directly implement the executor CPOs
PR #5949 - Converting async combinators into CPOs
PR #5948 - Adding support for pure sender/receiver based executors to parallel algorithms
PR #5945 - [P2300] Added fundamental coroutine_traits for S/R
PR #5883 - Optimization on LCI parcelport: uses LCI_putva
PR #5872 - Block fork join executor
PR #5855 - Adding performance test Jenkins builder at LSU