HPX V1.7.0 (Jul 14, 2021)
This release is again focused on C++20 conformance of algorithms. Additionally, many new experimental sender-based algorithms have been added based on the latest proposals.
General changes
The following algorithms have been adapted to be C++20 conformant: remove, remove_if, remove_copy, remove_copy_if, replace, replace_if, reverse, and lexicographical_compare.
When the compiler and standard library support the standard execution policies std::execution::seq, std::execution::par, and std::execution::par_unseq, they can now be used in all HPX parallel algorithms with equivalent behaviour to the non-task policies hpx::execution::seq, hpx::execution::par, and hpx::execution::par_unseq.
Vc support has been fixed, after being broken in 1.6.0. In addition, HPX now experimentally supports GCC’s SIMD implementation, when available. The implementation can be used through the hpx::execution::simd and hpx::execution::simdpar execution policies.
The customization points sync_execute, async_execute, then_execute, post, bulk_sync_execute, bulk_async_execute, and bulk_then_execute are now implemented using tag_dispatch (previously tag_invoke). Executors can still be implemented by providing the aforementioned functions as member functions of an executor.
New functionality, enhancements, and fixes based on P0443r14 (executors proposal) and P1897 (sender-based algorithms) have been added to the hpx::execution::experimental namespace. These can be accessed through the hpx/execution.hpp and hpx/local/execution.hpp headers. In particular, the following sender-based algorithms have been added: detach, ensure_started, just, just_on, let_error, let_value, on, transform, and when_all.
Additionally, futures now implement the sender concept. make_future can be used to turn a sender into a future. All functionality is experimental and can change without notice.
All hpx::init and hpx::start overloads now take std::function objects instead of hpx::util::function_nonser. No changes should be required in user code to accommodate this change.
hpx::util::unwrapping and other related unwrapping functionality have been moved up into the hpx namespace. Names in hpx::util are still usable with a deprecation warning. This functionality can now be accessed through the hpx/unwrap.hpp and hpx/local/unwrap.hpp headers.
The default tag for APEX has been updated from 2.3.1 to 2.4.0. In particular, this fixes a bug which could lead to hangs in distributed runs.
The dependency on Boost.Asio has been replaced with the standalone Asio available at https://github.com/chriskohlhoff/asio. By default, a system-installed Asio will be used. ASIO_ROOT can be given as a hint to tell CMake where to find Asio. Alternatively, Asio can be fetched automatically using CMake’s FetchContent by setting HPX_WITH_FETCH_ASIO=ON. In general, dependencies on Boost have again been reduced.
Modularization of the library has continued. In this release almost all functionality has been moved into modules. These changes do not generally affect user code. Warnings are still issued for headers that have moved.
hipBLAS is now optional when compiling with hipcc. A warning instead of an error will be printed if hipBLAS is not found during configuration.
Previously HPX_COMPUTE_HOST_CODE was defined in host code only if HPX was configured with CUDA or HIP. In this release HPX_COMPUTE_HOST_CODE is always defined in host code.
An experimental HPX_WITH_PRECOMPILED_HEADERS CMake option has been added to use precompiled headers when building HPX. This option should not be used on Windows.
Numerous bug fixes.
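To give a feel for what "sender-based" means in the notes above, here is a minimal toy sketch of lazy sender/receiver composition in plain C++ (assuming C++14 or later). It does not use HPX and is not HPX's implementation: the namespace toy, the types just_sender, transform_sender, and int_receiver, and the helper run are all hypothetical. Only the names just and transform are borrowed from the algorithm list, and they are reimplemented here in a heavily simplified, single-threaded form.

```cpp
#include <cassert>
#include <utility>

namespace toy {    // hypothetical namespace, not part of HPX

// A sender only describes work. Connecting it to a receiver yields an
// operation state, and nothing runs until start() is called on it.
template <typename T>
struct just_sender {
    T value;

    template <typename Receiver>
    struct operation {
        T value;
        Receiver receiver;
        void start() { receiver.set_value(std::move(value)); }
    };

    template <typename Receiver>
    operation<Receiver> connect(Receiver r) {
        return operation<Receiver>{std::move(value), std::move(r)};
    }
};

template <typename T>
just_sender<T> just(T value) { return just_sender<T>{std::move(value)}; }

// transform wraps the downstream receiver so that it receives f(value).
template <typename Sender, typename F>
struct transform_sender {
    Sender sender;
    F f;

    template <typename Receiver>
    struct receiver_adapter {
        F f;
        Receiver receiver;
        template <typename T>
        void set_value(T&& v) { receiver.set_value(f(std::forward<T>(v))); }
    };

    template <typename Receiver>
    auto connect(Receiver r) {
        return sender.connect(
            receiver_adapter<Receiver>{std::move(f), std::move(r)});
    }
};

template <typename Sender, typename F>
transform_sender<Sender, F> transform(Sender s, F f) {
    return transform_sender<Sender, F>{std::move(s), std::move(f)};
}

// Final receiver: stores the int result through a pointer.
struct int_receiver {
    int* out;
    void set_value(int v) { *out = v; }
};

// Connects and starts the sender inline (toy: no threads, int results only).
template <typename Sender>
int run(Sender s) {
    int result = 0;
    auto op = s.connect(int_receiver{&result});
    op.start();
    return result;
}

// Usage: describe the pipeline first, then run it.
inline int demo() {
    auto s = transform(just(20), [](int x) { return x * 2 + 2; });
    int r = run(std::move(s));
    assert(r == 42);
    return r;
}

}    // namespace toy
```

The point of the sketch is the separation between describing work (building the sender) and executing it (connect followed by start). The experimental facilities in hpx::execution::experimental additionally deal with errors, cancellation, and schedulers, none of which is modelled here.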
Breaking changes
The minimum required CMake version is now 3.17.
The minimum required Boost version is now 1.71.0.
The customization mechanism used to implement and extend sender functionality and algorithms has been renamed from tag_invoke to tag_dispatch. All customization of sender functionality should be done by overloading tag_dispatch.
The following compatibility options have been removed, along with their compatibility implementations:
- HPX_PROGRAM_OPTIONS_WITH_BOOST_PROGRAM_OPTIONS_COMPATIBILITY
- HPX_WITH_ACTION_BASE_COMPATIBILITY
- HPX_WITH_EMBEDDED_THREAD_POOLS_COMPATIBILITY
- HPX_WITH_POOL_EXECUTOR_COMPATIBILITY
- HPX_WITH_PROMISE_ALIAS_COMPATIBILITY
- HPX_WITH_REGISTER_THREAD_COMPATIBILITY
- HPX_WITH_REGISTER_THREAD_OVERLOADS_COMPATIBILITY
- HPX_WITH_THREAD_AWARE_TIMER_COMPATIBILITY
- HPX_WITH_THREAD_EXECUTORS_COMPATIBILITY
- HPX_WITH_THREAD_POOL_OS_EXECUTOR_COMPATIBILITY
The HPX_WITH_THREAD_SCHEDULERS CMake option has been removed. All schedulers are now enabled when possible.
HPX_WITH_INIT_START_OVERLOADS_COMPATIBILITY has been turned off by default.
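The general shape of customizing behaviour "by overloading tag_dispatch", as mentioned above, can be sketched in plain C++ without HPX. This is only an illustration of the pattern under stated assumptions: the namespaces lib and user, the customization point describe, and the type widget are hypothetical, and HPX's own dispatch machinery is more elaborate than this.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace lib {    // hypothetical library namespace, not HPX

// Customization point object. Calling it makes an unqualified call to
// tag_dispatch, passing itself as the tag, so that overloads are found
// through argument-dependent lookup (ADL).
struct describe_t {
    template <typename T>
    auto operator()(T const& t) const
        -> decltype(tag_dispatch(std::declval<describe_t const&>(), t))
    {
        return tag_dispatch(*this, t);
    }
};

constexpr describe_t describe{};

}    // namespace lib

namespace user {    // hypothetical user code

struct widget {
    int id;

    // The customization: an overload of tag_dispatch selected by the tag
    // type. As a hidden friend it is only found through ADL on widget.
    friend std::string tag_dispatch(lib::describe_t, widget const& w)
    {
        return "widget#" + std::to_string(w.id);
    }
};

// Usage: callers go through the customization point, never through
// tag_dispatch directly.
inline std::string demo()
{
    widget w{7};
    std::string s = lib::describe(w);
    assert(s == "widget#7");
    return s;
}

}    // namespace user
```

Because every customization point shares the single overloaded name tag_dispatch and is distinguished by its tag type, a library can add new customization points without reserving new function names in user namespaces.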
Closed issues
Issue #5423 - Fix lvalue-ref qualified connect for when_all-sender
Issue #5412 - Link error
Issue #5397 - Performance regression in thread annotations
Issue #5395 - HPX 1.7.0-rc1 fails to build icw APEX + OTF2
Issue #5385 - HPX 1.7 crashes on Piz Daint > 64 nodes
Issue #5380 - CMake should search for asio package installed on the system
Issue #5378 - HPX 1.7.0 stopped building on Fedora
Issue #5369 - HPX 1.6 and master hangs on Summit for > 64 nodes
Issue #5358 - HPX init fails for single-core environments
Issue #5345 - Rename P2220 property CPOs?
Issue #5333 - HPX does not compile on the new Mac OSX using the M1 chip
Issue #5317 - Consider making hipblas optional
Issue #5306 - asio fails to build with CUDA 10.0
Issue #5294 - execution::on should be based on execution::schedule
Issue #5275 - HPX V1.6.0 fails on Fedora release
Issue #5270 - HPX-1.6.0 fails to build on Windows 10
Issue #5257 - Allow triggering the output of OS thread affinity from configuration settings
Issue #5246 - HPX fails to build on ppc64le
Issue #5232 - Annotation using hpx::util::annotated_function not working
Issue #5222 - Build and link errors with ittnotify enabled
Issue #5204 - Move algorithms to tag_fallback_dispatch
Issue #5163 - Remove module-specific compatibility and deprecation options
Issue #5161 - Bump required CMake version to 3.17
Issue #5143 - Searching for HPX-Application to generate work on multiple Nodes
Closed pull requests
PR #5438 - Delete datapar/foreach_tests.hpp
PR #5437 - Add back explicit -pthread flags when available
PR #5435 - This adds support for systems that assume all types are bitwise serializable by default
PR #5434 - Update CUDA polling logging to be more verbose
PR #5433 - Fix when_all_sender connect for references
PR #5432 - Add deprecation warnings for v1.8
PR #5431 - Rename the new P0443/P2300 executor to thread_pool_scheduler
PR #5430 - Revert “Adding the missing defined for HPX_HAVE_DEPRECATION_WARNINGS”
PR #5427 - Removing unneeded typedef
PR #5426 - Adding more concept checks for sender/receiver algorithms
PR #5425 - Adding the missing defined for HPX_HAVE_DEPRECATION_WARNINGS
PR #5424 - Disable Vc in final docker image created in CI
PR #5422 - Adding execution::experimental::bulk algorithm
PR #5420 - Update logic to find threading library
PR #5418 - Reduce max size and number of files in ccache cache
PR #5417 - Final release notes for 1.7.0
PR #5416 - Adapt uninitialized_value_construct and uninitialized_value_construct_n to C++ 20
PR #5415 - Adapt uninitialized_default_construct and uninitialized_default_construct_n to C++ 20
PR #5414 - Improve integration of futures and senders
PR #5413 - Fixing sender/receiver code base to compile with MSVC
PR #5407 - Handle exceptions thrown during initialization of parcel handler
PR #5406 - Simplify dispatching to annotation handlers
PR #5405 - Fetch Asio automatically in perftests CI
PR #5403 - Create generic executor that adds annotations to any other executor
PR #5402 - Adapt uninitialized_fill and uninitialized_fill_n to C++ 20
PR #5401 - Modernize a variety of facilities related to parallel algorithms
PR #5400 - Fix sliding semaphore test
PR #5399 - Rename leftover tag_fallback_invoke to tag_fallback_dispatch
PR #5398 - Improve logging in AGAS symbol namespace
PR #5396 - Introduce compatibility layer for collective operations
PR #5394 - Enable OTF2 in APEX CI configuration
PR #5393 - Update APEX tag
PR #5392 - Fixing wrong usage of std::forward
PR #5391 - Fix forwarding in transform_receiver constructor
PR #5390 - Make sure shared priority scheduler steals tasks on the current NUMA domain when (core) stealing is enabled
PR #5389 - Adapt uninitialized_move and uninitialized_move_n to C++ 20
PR #5388 - Fixing gather_there for used with lvalue reference argument
PR #5387 - Extend thread state logging and change default stealing parameters
PR #5386 - Attempt to fix the startup hang with nodes > 32
PR #5384 - Remove HPX 1.5.0 deprecations
PR #5382 - Prefer installed Asio before considering FetchContent
PR #5379 - Allow using pre-downloaded (not installed) versions of Asio and/or Apex
PR #5376 - Remove unnecessary explicit listing of library modules.rst files in CMakeLists.txt
PR #5375 - Slight performance improvement for hpx::copy and hpx::move et.al.
PR #5374 - Remove unnecessary moves from future sender implementations
PR #5373 - More changes to clang-cuda Jenkins configuration
PR #5372 - Slight improvements to min/max/minmax_element algorithms
PR #5371 - Adapt uninitialized_copy and uninitialized_copy_n to C++ 20
PR #5370 - Decay types in just_sender value_types to match stored types
PR #5367 - Disable pkgconfig by default again on macOS
PR #5365 - Use ccache for Jenkins builds on Piz Daint
PR #5363 - Update cudatoolkit module name in clang-cuda Jenkins configuration
PR #5362 - Adding channel_communicator
PR #5361 - Fix compilation with MPI enabled
PR #5360 - Update APEX and asio tags
PR #5359 - Fix check for pu-step in single-core case
PR #5357 - Making sure collective operations can be reused by preallocating communicator
PR #5356 - Update API documentation
PR #5355 - Make the sequenced_executor processing_units_count member function const
PR #5354 - Making sure default_stack_size is defined whenever declared
PR #5353 - Add CUDA timestamp support to HPX Hardware Clock
PR #5352 - Adding missing includes
PR #5351 - Adding enable_logging/disable_logging API functions
PR #5350 - Adapt lexicographical_compare to C++20
PR #5349 - Update minimum boost version needed on the docs
PR #5348 - Rename tag_invoke and related facilities to tag_dispatch
PR #5347 - Remove make_ prefix for executor properties
PR #5346 - Remove and disable compatibility options for 1.7.0
PR #5343 - Fix timed_executor static cast conversion
PR #5342 - Refactor CUDA event polling
PR #5341 - Adding make_with_annotation and get_annotation properties
PR #5339 - Making sure hpx::util::hardware::timestamp() is always defined
PR #5338 - Fixing timed_executor specializations of customization points
PR #5335 - Make partial_algorithm work with any number of arguments
PR #5334 - Follow up iter_sent include on #5225
PR #5332 - Simplify tag_invoke and friends
PR #5331 - More work on cleaning up executor CPOs
PR #5330 - Add option to disable pkgconfig generation
PR #5328 - Adapt data parallel support using std-simd
PR #5327 - Fix missing ifdef HPX_SMT_PAUSE
PR #5326 - Adding resize() to serialize_buffer allowing to shrink its size
PR #5324 - Add get member functions to async_rw_mutex proxy objects for explicitly getting the wrapped value
PR #5323 - Add keep_future algorithm
PR #5322 - Replace executor customization point implementations with tag_invoke
PR #5321 - Seperate segmented algorithms for reduce
PR #5320 - Fix is_sender trait and other small fixes to p0443 traits
PR #5319 - gcc 11.1 c++20 build fixes
PR #5318 - Make hipblas dependency optional as not always available
PR #5316 - Attempt to fix checking for libatomic
PR #5315 - Add explicit keyword to fixture constructor
PR #5314 - Fix a race condition in async mpi affecting limiting executor
PR #5312 - Use local runtime and local headers in local-only modules and tests
PR #5311 - Add GCC 11 builder to jenkins
PR #5310 - Adding hpx::execution::experimental::task_group
PR #5309 - Seperate datapar
PR #5308 - Seperate segmented algorithms for find, find_if, find_if_not
PR #5307 - Seperate segmented algorithms for fill and generate
PR #5304 - Fix compilation of sender CPOs with nvcc
PR #5300 - Remove PRIVATE flag that was propagated into the LANGUAGES
PR #5298 - Seperate datapar
PR #5297 - Specify exact cmake and ninja versions when loading them in jenkins jobs
PR #5295 - Update clang-newest configuration to use clang 12 and Boost 1.76.0
PR #5293 - Fix Clang 11 cuda_future test bug
PR #5292 - Add async_rw_mutex based on senders
PR #5291 - “Fix” termination detection
PR #5290 - Fixed source file line statements in examples documentation
PR #5289 - Allow splitting of futures holding std::tuple
PR #5288 - Move algorithms to tag_fallback_invoke
PR #5287 - Move algorithms to tag_fallback_invoke
PR #5285 - Fix clang-format failure on master
PR #5284 - Replacing util::function_nonser on std::function in hpx_init
PR #5282 - Update Boost for daint 20.11 after update
PR #5281 - Fix Segmentation fault on foreach_datapar_zipiter
PR #5280 - Avoid modulo by zero in counting_iterator test
PR #5279 - Fix more GCC 10 deprecation warnings
PR #5277 - Small fixes and improvements to CUDA/MPI polling
PR #5276 - Fix typo in docs
PR #5274 - More P1897 algorithms
PR #5273 - Retry CDash submissions on failure
PR #5272 - Fix bogus deprecation warnings with GCC 10
PR #5271 - Correcting target ids for symbol_namespace::iterate
PR #5268 - Adding generic require, require_concept, and query properties
PR #5267 - Support annotations in hpx::transform_reduce
PR #5266 - Making late command line options available for local runtime
PR #5265 - Leverage no_unique_address for member_pack
PR #5264 - Adopt format in more places
PR #5262 - Install HPX in Rostam Jenkins jobs
PR #5261 - Limit Rostam Jenkins jobs to marvin partition temporarily
PR #5260 - Separate segmented algorithms for transform_reduce
PR #5259 - Making sure late command line options are recognized as configuration options
PR #5258 - Allow for HPX algorithms being invoked with std execution policies
PR #5256 - Separate segmented algorithms for transform
PR #5255 - Future/sender adapters
PR #5254 - Fixing datapar
PR #5253 - Add utility to format ranges
PR #5252 - Remove uses of Boost.Bimap
PR #5251 - Banish <iostream> from library headers
PR #5250 - Try fixing vc circle ci
PR #5249 - Adding missing header
PR #5248 - Use old Piz Daint modules after upgrade
PR #5247 - Significantly speedup simple for_each, for_loop, and transform
PR #5245 - P1897 operator| overloads
PR #5244 - P1897 when_all
PR #5243 - Make sure HPX_DEBUG is set based on HPX’s build type, not consuming project’s build type
PR #5242 - Moving last files unrelated to parcel layer to modules
PR #5240 - change namespace for transform_loop.hpp
PR #5238 - Make sure annotations are used in the binary transform
PR #5237 - Add P1897 just, just_on, and on algorithms
PR #5236 - Add an example demonstrating the use of the invoke_function_action facility
PR #5235 - Attempting to fix datapar compilation issues
PR #5234 - Fix small typo in --hpx:local option description
PR #5233 - Only find Boost.Iostreams if required for plugins
PR #5231 - Sort printed config options
PR #5230 - Fix C++20 replace algo adaptation misses
PR #5229 - Remove leftover Boost include from sync_wait.hpp
PR #5228 - Print module name only if it has custom configuration settings
PR #5227 - Update .codespell_whitelist
PR #5226 - Use new docker image in all CircleCI steps
PR #5225 - Adapt reverse to C++20
PR #5224 - Separate segmented algorithms for none_of, any_of and all_of
PR #5223 - Fixing build system for ittnotify
PR #5221 - Moving LCO related files to modules
PR #5220 - Seperate segmented algorithms for count and count_if
PR #5218 - Seperate segmented algorithms for adjacent_find
PR #5217 - Add a HIP github action
PR #5215 - Update ROCm to 4.0.1 on Rostam
PR #5214 - Fix clang-format error in sender.hpp
PR #5213 - Removing ESSENTIAL option to the doc example
PR #5212 - Seperate segmented algorithms for for_each_n
PR #5211 - Minor adapted algos fixes
PR #5210 - Fixing is_invocable deprecation warnings
PR #5209 - Moving more files into modules (actions, components, init_runtime, etc.)
PR #5208 - Add examples and explanation on when tag_fallback/priority are useful
PR #5207 - Always define HPX_COMPUTE_HOST_CODE for host code
PR #5206 - Add formatting exceptions for libhpx to create_module_skeleton.py
PR #5205 - Moving all distribution policies into modules
PR #5203 - Move copy algorithms to tag_fallback_invoke
PR #5202 - Make HPX_WITH_PSEUDO_DEPENDENCIES a cache variable
PR #5201 - Replaced tag_invoke with tag_fallback_invoke for adjacent_find algorithm
PR #5200 - Moving files to (distributed) runtime module
PR #5199 - Update ICC module name on Piz Daint Jenkins configuration
PR #5198 - Add doxygen documentation for thread_schedule_hint
PR #5197 - Attempt to fix compilation of context implementations with unity build enabled
PR #5196 - Re-enable component tests
PR #5195 - Moving files related to colocation logic
PR #5194 - Another attempt at fixing the Fedora 35 problem
PR #5193 - Components module
PR #5192 - Adapt replace(_if) to C++20
PR #5190 - Set compatibility headers by default to on
PR #5188 - Bump Boost minimum version to 1.71.0
PR #5187 - Force CMake to set the -std=c++XX flag
PR #5186 - Remove message to print .cu extension whenever .cu files are encountered
PR #5185 - Remove some minor unnecessary CMake options
PR #5184 - Remove some leftover HPX_WITH_*_SCHEDULER uses
PR #5183 - Remove dependency on boost/iterators/iterator_categories.hpp
PR #5182 - Fixing Fedora 35 for Power architectures
PR #5181 - Bump version number and tag post 1.6.0 release
PR #5180 - Fix htts_v2 tests linking
PR #5179 - Make sure --hpx:local command line option is respected with networking is off but distributed runtime is on
PR #5177 - Remove module cmake options
PR #5176 - Starting to separate segmented algorithms: for_each
PR #5174 - Don’t run segmented algorithms twice on CircleCI
PR #5173 - Fetching APEX using cmake FetchContent
PR #5172 - Add separate local-only entry point
PR #5171 - Remove HPX_WITH_THREAD_SCHEDULERS CMake option
PR #5170 - Add HPX_WITH_PRECOMPILED_HEADERS option
PR #5166 - Moving some action tests to modules
PR #5165 - Require cmake 3.17
PR #5164 - Move thread_pool_suspension_helper files to small utility module
PR #5160 - Adding checks ensuring modules are not cross-referenced from other module categories
PR #5158 - Replace boost::asio with standalone asio
PR #5155 - Allow logging when distributed runtime is off
PR #5153 - Components module
PR #5152 - Move more files to performance counter module
PR #5150 - Adapt remove_copy(_if) to C++20
PR #5144 - AGAS module
PR #5125 - Adapt remove and remove_if to C++20
PR #5117 - Attempt to fix segfaults assumed to be caused by future_data instances going out of scope.
PR #5099 - Allow mixing debug and release builds
PR #5092 - Replace spirit.qi with x3
PR #5053 - Add P0443r14 executor and a a few P1897 algorithms
PR #5044 - Add performance test in jenkins and reports