HPX V1.10.0 (May 29, 2024)
General changes
The HPX documentation has seen a major overhaul for this release. We finished documenting the public local HPX API and added migration guides from widely used parallelization platforms (OpenMP, TBB, and MPI) to HPX.
We have added facilities enabling optimizations for trivially-relocatable types (see P1144 for more details).
We have added (and use) the scope_xxx helper facilities as specified by the C++ library fundamentals TS v3 (see: N4948).
We have added configuration options that allow building HPX without pre-installing any prerequisites. Use HPX_WITH_FETCH_HWLOC=On to have Portable Hardware Locality (HWLOC) installed for you. Similarly, setting HPX_WITH_FETCH_BOOST=On at configuration time will install the necessary Boost libraries (currently V1.84.0).
We have performed a lot of code cleanup and refactoring to improve the overall code quality and decrease compile times.
The collective operations APIs have been unified, and we have fixed issues and performance problems in the collectives.
The HPX executors have been streamlined and made more consistent. We have applied many performance improvements to the executor implementations that directly improve the performance of our parallel algorithms.
We have added a new parcelport that allows using GASNet as a communication platform.
We have added optimizations to various parcelports, improving overall communication performance. These include, among other things, send-immediate optimizations and receiver-side zero-copy optimizations.
Futures can now execute the associated task eagerly and inline on any wait operation if the task has not started running yet. This feature is enabled with the HPX_COROUTINES_WITH_THREAD_SCHEDULE_HINT_RUNS_AS_CHILD=On configuration setting (which is Off by default).
We have enabled using JSON files to supply configuration information through the command line. This feature can be enabled with the configuration option HPX_COMMAND_LINE_HANDLING_WITH_JSON_CONFIGURATION_FILES=On. This functionality depends on the external JSON library, which can be built at configuration time by supplying HPX_WITH_FETCH_JSON=On to CMake.
We have applied many fixes to our CUDA, ROCm, and SYCL build environments.
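The scope_xxx helpers mentioned above follow the scope-guard pattern described in the library fundamentals TS v3: a callable is stored in an object and runs when that object leaves scope, unless the guard is released first. The following is a minimal sketch of that pattern in standard C++17, not HPX's implementation. The names scope_exit_sketch and run_demo are hypothetical and were introduced only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a TS-style scope_exit guard; this is NOT the
// HPX type, only an illustration of the pattern.
template <typename F>
class scope_exit_sketch
{
public:
    explicit scope_exit_sketch(F f)
      : f_(std::move(f))
    {
    }

    scope_exit_sketch(scope_exit_sketch const&) = delete;
    scope_exit_sketch& operator=(scope_exit_sketch const&) = delete;

    // Runs the stored callable unless release() was called first.
    ~scope_exit_sketch()
    {
        if (active_)
            f_();
    }

    // Disarms the guard so that the destructor does nothing.
    void release() noexcept
    {
        active_ = false;
    }

private:
    F f_;
    bool active_ = true;
};

// Records the order in which the scope body and the guards run.
std::vector<std::string> run_demo()
{
    std::vector<std::string> log;
    {
        scope_exit_sketch cleanup([&log] { log.push_back("cleanup"); });
        scope_exit_sketch skipped([&log] { log.push_back("skipped"); });
        skipped.release();    // this guard will not fire
        log.push_back("body");
    }    // guards are destroyed here, in reverse order of construction

    // The armed guard ran after the body; the released one did not run.
    assert(log.back() == "cleanup");
    return log;
}
```

Calling run_demo() from a program's main function yields a log in which the scope body is recorded before the armed guard's action, and the released guard leaves no entry. The TS additionally distinguishes guards that fire only on exceptional exit or only on normal exit; this sketch covers just the unconditional case.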
Breaking changes
The CMake configuration keys SOMELIB_ROOT (e.g., BOOST_ROOT) have been renamed to Somelib_ROOT (e.g., Boost_ROOT) to avoid warnings when using newer versions of CMake. Please update your scripts accordingly. For now, the old variable names are re-assigned to the new names and unset in the CMake cache.
Closed issues
Issue #6466 - No access limitations to Wiki
Issue #6461 - handle_received_parcels may never return
Issue #6459 - Building HPX
Issue #6451 - HPX hangs at the very end
Issue #6446 - Issue on page /manual/getting_hpx.html
Issue #6443 - PR #6435 (parcel_layer_tweaks) broke Octo-Tiger
Issue #6440 - HPX does not compile with MSVC of Visual Studio 2022 17.9+
Issue #6437 - HPX 1.9.1 does not compile on Fedora with ‘#pragma message: [Parallel STL message]: “Vectorized algorithm unimplemented, redirected to serial
Issue #6419 - Enhancement of the macro functionalities within hpx
Issue #6417 - The current HPX master branch is still not compatible with Kokkos 4.0.1
Issue #6414 - Current HPX master causes segfaults within Octo-Tiger
Issue #6412 - Clangd (Language Server) throws error for __integer_pack at pack.hpp
Issue #6407 - Cannot build Kokkos 4.0.01 with current HPX master
Issue #6405 - Spack Build Error with ROCm 5.7.0
Issue #6398 - HPX sets affinity wrong with multiple processes per node and LCI parcelport enabled
Issue #6392 - [Feature] Install dependencies using CMake
Issue #6388 - HPX error: “Host not found” when running on Expanse with 128 nodes
Issue #6366 - serialize_buffer allocator support needs adjustments
Issue #6361 - HPX 1.9.1 does not compile on Fedora 40
Issue #6355 - Single page documentation is broken
Issue #6334 - Segmentation fault after adding a padding in one_size_heap_list
Issue #6329 - Log hpx threads on forced shutdown
Issue #6316 - Build breaks on FreeBSD
Issue #6299 - HPX does not use distributed localities on Fugaku
Issue #6298 - Update config for coroutines on ARM
Issue #6291 - Zero-copy receive optimization disabled the invocation of direct actions
Issue #6261 - Add optional reading of json files for command line options
Issue #6087 - Support for vcpkg on Linux is broken
Issue #5921 - hpx::info claims that async_mpi was not built, while cmake assures its existence
Issue #5893 - Tests fail on FreeBSD: Executable copyn_test does not exist
Issue #5833 - barrier lockup
Issue #5799 - Investigate CUDA compilation problems
Issue #5340 - Examples do not run on Mac OSX using the M1 chip
Closed pull requests
PR #6493 - Fix distributed latch documentation
PR #6492 - Fix kokkos hpx nvcc compilation
PR #6491 - More fixes to handling bool arguments for collective operations
PR #6490 - Remove the default max cpu count
PR #6489 - Ensure TCP parcelport is deactivated if not needed
PR #6488 - Fixing handling of bool value type for collective operations
PR #6485 - Destructive interference size
PR #6484 - Improve performance counter error handling
PR #6482 - Generalize the notion of bitwise serialization
PR #6481 - Fixing use of HPX_WITH_CXX_STANDARD
PR #6480 - Remove equal_to from hpx::any
PR #6479 - Remove optimizations for certain built-in compiler intrinsics
PR #6478 - Fixing issues on MacOS
PR #6477 - lci pp: lci’s github repo name changed from LC to lci
PR #6476 - Fixing binary filter test target names
PR #6475 - Fix mac os github actions
PR #6472 - Troubleshoot CI hangs
PR #6469 - improve(lci pp): more options to control the LCI parcelport
PR #6467 - Bump jwlawson/actions-setup-cmake from 1.14 to 2.0
PR #6464 - Update docs of “Writing distributed applications” page
PR #6463 - Revert “Always return outermost thread id”
PR #6458 - Reduce test workload to fix CI/CD time-out
PR #6457 - replace boost::array with std::array and update file name
PR #6456 - Move APEX CI to rostam
PR #6455 - Fixing compilation if HPX_HAVE_THREAD_QUEUE_WAITTIME is defined
PR #6454 - Update perftests reference measurements
PR #6453 - Update supported platforms of Manual/Prerequisites page
PR #6452 - Fix nvcc crashes in transform_stream.cu and synchronize.cu
PR #6450 - Fix git tag name in Getting HPX page
PR #6449 - LCI parcelport: add yield to potentially infinite retry loop
PR #6447 - Use compressed ptr in schedulers when 128 atomics are not lockfree
PR #6445 - Fix agas addressing cache
PR #6444 - Update CTestConfig.cmake
PR #6442 - Update CMakeLists.txt
PR #6441 - Minor documentation fixes
PR #6439 - Optimizing use of certain #includes
PR #6438 - Bump jwlawson/actions-setup-cmake from 1.14 to 2.0
PR #6436 - Update docs
PR #6435 - Parcel layer tweaks
PR #6434 - improve termination detection: removing lock from critical path
PR #6433 - Use shared mutex for resolve_locality procedure
PR #6432 - Module cleanup up to level 30
PR #6429 - Making sure HPX_WITH_ASYNC_MPI is reported properly
PR #6427 - Modifying CMakeLists to copy libhwloc-15.dll to the binary folder in Windows, independently
PR #6425 - Fix macOS failing test
PR #6424 - Adding option for downloading Boost using CMake FetchContent
PR #6423 - Move adjacent_difference to numeric header file
PR #6422 - Adding steal-half functionalities to work-requesting scheduler
PR #6421 - Bump actions/checkout from 2 to 4
PR #6418 - Working around nvcc problems to use CTAD
PR #6416 - Change run_as_os_thread deprecation forwarding due to hipcc compilation issue
PR #6415 - Attempting to avoid segfault in OctoTiger during initialization
PR #6413 - Always return outermost thread id
PR #6411 - Minor refactoring and fixes to the LCI parcelport and pingpong_performance2 benchmark
PR #6410 - Adding scope_xxx from library fundamentals TS v3
PR #6409 - Working around CUDA issue
PR #6408 - Tightening up collective operation semantics
PR #6406 - Working around ROCm compiler issue
PR #6404 - Allow to disable use of [[no_unique_address]] attribute
PR #6403 - Fixing copyright year
PR #6402 - fix(lci pp): fix deadlocks with too many failed sends
PR #6401 - fix(lci pp): fix the null_thread_id bug in the LCI parcelport
PR #6400 - Fix the affinity setting bug when using LCI pp and multiple localities per node
PR #6397 - Change API header titles and info
PR #6396 - Making is_bitwise_serializable SFINAE-friendly
PR #6395 - Adapt amount of collective testing
PR #6394 - Adding option for installing Hwloc using CMake FetchContent
PR #6393 - Optionally disable caching allocator
PR #6391 - Cleaning up collective operations
PR #6390 - Making function local constexpr variables non-static
PR #6389 - Disable resolving hostnames if TCP is disabled
PR #6387 - Need to break out of the loop when searching the suffixes.
PR #6384 - Fixing allocation/deallocation mismatch in serialize_buffer
PR #6383 - Enable fork_join_executor to handle return values from scheduled functions
PR #6381 - Consistently treat conflicting parameters provided by executors and parameter objects
PR #6380 - Fixing setting an annotation for an execution policy
PR #6378 - Allowing to disable signal handlers
PR #6377 - Fix gasnet-related test failures
PR #6375 - Update LSU Jenkins with 2023-10 libraries
PR #6374 - Investigate builder gasnet failure
PR #6373 - Fixing communicator API, adding docs
PR #6372 - Fix resource partitioner tests for small thread count
PR #6371 - Fix jacobi omp examples.
PR #6370 - improve one_size_heap_list: use rwlock to speedup the allocation/free
PR #6369 - working issue with MPI_CC / CC conflict in automake
PR #6368 - Making sure serialize_buffer properly destroys buffer, if needed.
PR #6367 - Fix parallel relocation test
PR #6364 - Relocation variants
PR #6363 - Update the lci parcelport to use LCI v1.7.6
PR #6362 - Fixing compilation problems on 32 Linux systems
PR #6360 - Fix broken links in docs: PDF, Single HTML page, Dependency report
PR #6359 - Fix header file links in Public API page
PR #6358 - Fix CMake find_library for HWLOC
PR #6357 - Replace Custom Benchmarking Code with Nanobench
PR #6356 - Fixed matrix multiplication example output
PR #6354 - Fix broken links for header files in Public API page
PR #6353 - Enable using std::reference_wrapper with executor parameters
PR #6352 - Add Public distributed API documentation
PR #6350 - Make coverage work with Jenkins Github Branch Source plugin
PR #6349 - Moving hpx::threads::run_as_xxx to namespace hpx
PR #6348 - Adding --exclusive to launching tests on rostam
PR #6346 - changed chat link to discord
PR #6344 - uninitialized_relocate w/ type_support primitive
PR #6343 - Bump actions/checkout from 3 to 4
PR #6342 - Fix HPX-APEX cmake integration
PR #6341 - Fix shared_future_continuation_order regression test
PR #6340 - Log alive hpx threads on exit
PR #6339 - Add coverage testing on Jenkins
PR #6338 - Fixing HPX_CURRENT_SOURCE_LOCATION when std::source_location exists
PR #6337 - Remove aurianer, biddisco, and msimberg from codeowners
PR #6336 - More cleaning up for module levels 19-20
PR #6335 - Finalize the MPI docs of the Migration Guide
PR #6332 - More fixes for CMake V3.27
PR #6330 - Adding basic logging to collective operations
PR #6328 - Cleanup previous patch adapting to CMake V3.27
PR #6327 - Modernize modules in level 17 and 18
PR #6324 - P1144 Relocation primitives
PR #6321 - Ensure hpx_main is a proper thread_function
PR #6320 - Fixing cyclic dependencies in naming and agas modules
PR #6319 - Generate git tag if needed but it is not available
PR #6317 - Fixing linker problem on FreeBSD
PR #6315 - acknowledge triv-rel and nothrow-rel types
PR #6314 - Relocation algorithms Clean
PR #6313 - Trivial relocation of c-v-ref-array types
PR #6312 - Fixing warning/error
PR #6311 - Adding executor parallel invoke CPOs
PR #6310 - Define HPX_COMPUTE_CODE in builds with SYCL
PR #6309 - Making sure changed number of cores is propagated to executor
PR #6308 - openshmem-parcelport initial import
PR #6306 - The hpxcxx script was broken such that it could only compile for _release
PR #6305 - Adapting build system for CMake V3.27
PR #6304 - Fixing an integral type mismatch warning
PR #6303 - omp for default vectorization
PR #6301 - Add MPI migration guide
PR #6294 - Add internal reference counting to semaphores
PR #6286 - Simd helpers
PR #6280 - Add TBB to HPX documentation in Migration Guide
PR #6276 - Add dependabot.yml
PR #6275 - Revert “Move dependabot.yml into correct directory”
PR #6272 - set thread name for linux
PR #6271 - Uninitialised algorithms, move using std::memcpy
PR #6270 - Bump jwlawson/actions-setup-cmake from 1.9 to 1.14
PR #6269 - Bump actions/checkout from 2 to 3
PR #6268 - Move dependabot.yml into correct directory
PR #6265 - Create dependabot.yml
PR #6264 - hpx::is_trivially_relocatable trait implementation
PR #6263 - Adding support for reading json configuration files for command line options
PR #6249 - Implement the send immediate optimization for the MPI parcelport.
PR #6237 - Improve compilation performance
PR #6234 - Adding release notes page for next release
PR #6233 - Moving is_relocatable to namespace hpx
PR #6230 - gasnet based parcelport
PR #6226 - Re-enable dependency on segmented algorithms on CircleCI
PR #6220 - Add execution on
PR #6212 - Initial trait definition for relocatable
PR #6199 - added support for unseq, par_unseq for hpx::make_heap algorithm
PR #6173 - C++ modules
PR #6122 - Add Module support
PR #6099 - Futures attempt to execute threads directly if those have not started executing
PR #6050 - Investigating partitioned_vector problems
PR #5988 - Adding CI configuration for DGX-A100 at LSU
PR #5910 - Improve MPI initialization
PR #5845 - Adding local work requesting scheduler that is based on message passing internally